METHODS FOR IDENTIFYING COMPOUNDS OF INTEREST
USING ENCODED LIBRARIES 



Related Applications 

This application claims priority to U.S. Provisional Application No. 
60/731,464, filed October 28, 2005. This application is related to U.S. Patent 
Application No. 60/689,466, filed June 9, 2005, pending, and U.S. Patent Application
No. 11/015458 filed December 17, 2004. This application is also related to U.S.
Provisional Patent Application Serial No. 60/530,854, filed on December 17, 2003;
U.S. Provisional Patent Application Serial No. 60/540,681, filed on January 30, 2004; 
U.S. Provisional Patent Application Serial No. 60/553,715 filed March 15, 2004; and 
U.S. Provisional Patent Application Serial No. 60/588,672 filed July 16, 2004. The 
entire contents of each of the foregoing applications are incorporated herein by 
reference. 

Background of the invention 

The search for more efficient methods of identifying compounds having useful 
biological activities has led to the development of methods for screening vast numbers
of distinct compounds, present in collections referred to as combinatorial libraries. 
Such libraries can include 10^ or more distinct compounds. A variety of methods 
exist for producing combinatorial libraries, and combinatorial syntheses of peptides, 
peptidomimetics and small organic molecules have been reported.

The two major challenges in the use of combinatorial approaches in drug 
discovery are the synthesis of libraries of sufficient complexity and the identification 
of molecules which are active in the screens used. It is generally acknowledged that
the greater the degree of complexity of a library, i.e., the number of distinct structures
present in the library, the greater the probability that the library contains molecules
with the activity of interest. Therefore, the chemistry employed in library synthesis
must be capable of producing vast numbers of compounds within a reasonable time
frame. However, for a given formal or overall concentration, increasing the number
of distinct members within the library lowers the concentration of any particular
library member. This complicates the identification of active molecules from high
complexity libraries.
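
The dilution effect described above can be made concrete with a short calculation. The following is a minimal sketch that assumes, purely for illustration, an equimolar library and a hypothetical overall concentration of 1 mM; neither assumption comes from the text.

```python
def per_member_concentration(total_molar: float, n_members: int) -> float:
    """Concentration (mol/L) of each member of an equimolar library."""
    if n_members < 1:
        raise ValueError("a library needs at least one member")
    return total_molar / n_members


# Hypothetical overall concentration of 1 mM, chosen only for illustration.
TOTAL_MOLAR = 1e-3

for size in (10**3, 10**6, 10**9):
    conc = per_member_concentration(TOTAL_MOLAR, size)
    print(f"{size:>13,d} members: {conc:.1e} M per member")
```

At a fixed overall concentration, each thousand-fold increase in library size lowers the concentration of every individual member by the same factor.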

One approach to overcoming these obstacles has been the development of
encoded libraries, and particularly libraries in which each compound includes an
amplifiable tag. Such libraries include DNA-encoded libraries, in which a DNA tag
identifying a library member can be amplified using techniques of molecular biology,
such as the polymerase chain reaction. However, the use of such methods for
producing very large libraries is yet to be demonstrated, and it is clear that improved
methods for producing such libraries are required for the realization of the potential of
this approach to drug discovery.



Summary of the invention

Traditional drug discovery methods have relied on multi-step selection
processes, often involving the amplification (e.g., PCR amplification) of nucleic acid
molecules, and the sequencing of up to 1,000 or more of the top clones. This multi-step
selection process and the nucleic acid amplification often lead to the introduction
of many biases (as discussed in, for example, Holt, L. J., et al. (2000) Nucleic Acids
Res. 28(15):E72). The presence of these biases typically leads to the selection of
compounds that lack the desired biological activity.

The present invention provides improved methods as compared to the prior art
methods in that it provides methods which eliminate the foregoing biases. For
example, the present invention provides methods of identifying a compound of
interest using a massively parallel sequencing approach which leads to the accurate
identification of a compound with a desired biological activity using fewer selection
steps. Moreover, as described herein, a unique tagging system has been developed
that eliminates biases introduced by nucleic acid amplification, e.g., PCR
amplification. In addition, the methods described herein allow for an expansive and
extensive analysis of the selected compounds having a desired biological property,
which, in turn, allows for related compounds with familial structural relationships to
be identified (structure activity relationships). In summary, using the methods of the
invention, a single step selection/enrichment cycle can be performed and then
sequencing can be performed at the single molecule level, preferably without the need
for any nucleic acid amplification.

Accordingly, in one aspect, the invention provides a method for identifying
one or more compounds which bind to a biological target. The method comprises
synthesizing a library of compounds, wherein the compounds comprise a functional
moiety comprising two or more building blocks which is operatively linked to an
initial oligonucleotide which identifies the structure of the functional moiety by
providing a solution comprising m initiator compounds, wherein m is an integer of 1
or greater, where the initiator compounds consist of a functional moiety comprising n
building blocks, where n is an integer of 1 or greater, which is operatively linked to an
initial oligonucleotide which identifies the n building blocks, dividing the solution
described above into r reaction vessels, wherein r is an integer of 2 or greater, thereby
producing r aliquots of the solution, reacting the initiator compounds in each reaction
vessel with one of r building blocks, thereby producing r aliquots comprising
compounds consisting of a functional moiety comprising n+1 building blocks
operatively linked to the initial oligonucleotide, and reacting the initial
oligonucleotide in each aliquot with one of a set of r distinct incoming
oligonucleotides in the presence of an enzyme which catalyzes the ligation of the
incoming oligonucleotide and the initial oligonucleotide, under conditions suitable for
enzymatic ligation of the incoming oligonucleotide and the initial oligonucleotide;
thereby producing r aliquots of molecules consisting of a functional moiety
comprising n+1 building blocks operatively linked to an elongated oligonucleotide
which encodes the n+1 building blocks; contacting the biological target with the
library of compounds, or a portion thereof, under conditions suitable for at least one
member of the library of compounds to bind to the target, removing library members
that do not bind to the target, sequencing the encoding oligonucleotides of the at least
one member of the library of compounds which binds to the target, and using the
foregoing sequences to determine the structure of the functional moieties of the
members of the library of compounds which bind to the biological target, thereby
identifying one or more compounds which bind to the biological target.
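
The synthesis steps recited above (divide into r aliquots, add one building block per aliquot, ligate the matching incoming oligonucleotide) can be sketched in a few lines of code. This is only an illustrative model: each compound is represented as a pair of a building-block tuple and a tag string, the building-block names and three-base codons are hypothetical, and recombining the aliquots between cycles is an assumption of the sketch rather than something recited in the passage above.

```python
def split_and_pool_cycle(pool, building_blocks, codons):
    """Run one synthesis cycle on a pool of (moiety, tag) pairs.

    The pool is split into r aliquots (r = len(building_blocks)). In each
    aliquot one building block is added to every functional moiety and the
    matching incoming oligonucleotide (codon) is appended to every tag.
    The aliquots are then recombined into a single pool.
    """
    if len(building_blocks) != len(codons) or len(set(codons)) != len(codons):
        raise ValueError("each building block needs its own distinct codon")
    next_pool = []
    for block, codon in zip(building_blocks, codons):  # one aliquot per block
        for moiety, tag in pool:
            next_pool.append((moiety + (block,), tag + codon))
    return next_pool


# Hypothetical initiator compound (m = 1, n = 1) and codon assignments.
pool = [(("Gly",), "ACG")]
cycles = [
    (["Ala", "Phe", "Leu"], ["AAC", "AAG", "AAT"]),
    (["Ser", "Thr", "Val"], ["CCA", "CCG", "CCT"]),
]
for blocks, codons in cycles:
    pool = split_and_pool_cycle(pool, blocks, codons)

print(len(pool))  # m multiplied by r for every cycle
```

In this model the library grows by a factor of r per cycle, and every member carries a distinct tag that records which building block was added in which cycle.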

In one embodiment, the methods of the invention may further comprise 
amplifying the encoding oligonucleotide of the at least one member of the library of 
compounds which binds to the target prior to sequencing. 

In one embodiment, the method of amplifying comprises forming a water-in-oil
emulsion to create a plurality of aqueous microreactors, wherein at least one of the
microreactors comprises the at least one member of the library of compounds that
binds to the target, a single bead capable of binding to the encoding oligonucleotide of
the at least one member of the library of compounds that binds to the target, and
amplification reaction solution containing reagents necessary to perform nucleic acid
amplification, amplifying the encoding oligonucleotide in the microreactors to form
amplified copies of the encoding oligonucleotide, and binding the amplified copies of
the encoding oligonucleotide to the beads in the microreactors.

In one embodiment, the method of sequencing comprises annealing an
effective amount of a sequencing primer to the amplified copies of the encoding
oligonucleotide and extending the sequencing primer with a polymerase and a
predetermined nucleotide triphosphate to yield a sequencing product and, if the
predetermined nucleotide triphosphate is incorporated onto a 3' end of the sequencing
primer, a sequencing reaction byproduct, and identifying the sequencing reaction
byproduct, thereby determining the sequence of the encoding oligonucleotide.

In one embodiment, sequencing is performed using the polymerase chain 
reaction. In another embodiment, sequencing is performed using a pyrophosphate 
sequencing method or using a single molecule sequencing by synthesis method. 

Brief description of the drawings

Figure 1 is a schematic representation of ligation of double stranded
oligonucleotides, in which the initial oligonucleotide has an overhang which is
complementary to the overhang of the incoming oligonucleotide. The initial strand is
represented as either free, conjugated to an aminohexyl linker or conjugated to a
phenylalanine residue via an aminohexyl linker.

Figure 2 is a schematic representation of oligonucleotide ligation using a splint
strand. In this embodiment, the splint is a 12-mer oligonucleotide with sequences
complementary to the single-stranded initial oligonucleotide and the single-stranded
incoming oligonucleotide.

Figure 3 is a schematic representation of ligation of an initial oligonucleotide
and an incoming oligonucleotide, when the initial oligonucleotide is double-stranded
with covalently linked strands, and the incoming oligonucleotide is double-stranded.

Figure 4 is a schematic representation of oligonucleotide elongation using a
polymerase. The initial strand is represented as either free, conjugated to an
aminohexyl linker or conjugated to a phenylalanine residue via an aminohexyl linker.

Figure 5 is a schematic representation of the synthesis cycle of one
embodiment of the invention.

Figure 6 is a schematic representation of a multiple round selection process 
using the libraries of the invention. 

Figure 7 is a gel resulting from electrophoresis of the products of each of
cycles 1 to 5 described in Example 1 and following ligation of the closing primer. 
Molecular weight standards are shown in lane 1, and the indicated quantities of a 
hyperladder, for DNA quantitation, are shown in lanes 9 to 12. 

Figure 8 is a schematic depiction of the coupling of building blocks using 
azide-alkyne cycloaddition. 

Figures 9 and 10 illustrate the coupling of building blocks via nucleophilic 
aromatic substitution on a chlorinated triazine. 

Figure 11 shows representative chlorinated heteroaromatic structures suitable
for use in the synthesis of functional moieties. 

Figure 12 illustrates the cyclization of a linear peptide using the azide/alkyne
cycloaddition reaction. 

Figure 13a is a chromatogram of the library produced as described in Example
2 following Cycle 4.

Figure 13b is a mass spectrum of the library produced as described in Example
2 following Cycle 4.

Detailed description of the invention 

The present invention relates to methods of producing compounds and 
combinatorial compound libraries, the compounds and libraries produced via the 
methods of the invention, and methods of using the libraries to identify compounds
having a desired property, such as a desired biological activity. The invention further
relates to the compounds identified using these methods.

A variety of approaches have been taken to produce and screen combinatorial 
chemical libraries. Examples include methods in which the individual members of 
the library are physically separated from each other, such as when a single compound 
is synthesized in each of a multitude of reaction vessels. However, these libraries are
typically screened one compound at a time, or at most, several compounds at a time
and do not, therefore, result in the most efficient screening process. In other
methods, compounds are synthesized on solid supports. Such solid supports include 
chips in which specific compounds occupy specific regions of the chip or membrane 
("position addressable"). In other methods, compounds are synthesized on beads, 
with each bead containing a different chemical structure.

Two difficulties that arise in screening large libraries are (1) the number of
distinct compounds that can be screened; and (2) the identification of compounds
which are active in the screen. In one method, the compounds which are active in the
screen are identified by narrowing the original library into ever smaller fractions and
subfractions, in each case selecting the fraction or subfraction which contains active
compounds and further subdividing until attaining an active subfraction which
contains a set of compounds which is sufficiently small that all members of the subset
can be individually synthesized and assessed for the desired activity. This is a tedious
and time consuming activity.

Another method of deconvoluting the results of a combinatorial library screen
is to utilize libraries in which the library members are tagged with an identifying
label, that is, each label present in the library is associated with a discrete compound
structure present in the library, such that identification of the label tells the structure
of the tagged molecule. One approach to tagged libraries utilizes oligonucleotide
tags, as described, for example, in US Patent Nos. 5,573,905; 5,708,153; 5,723,598;
6,060,596; published PCT applications WO 93/06121; WO 93/20242; WO 94/13623;
WO 00/23458; WO 02/074929 and WO 02/103008, and by Brenner and Lerner (Proc.
Natl. Acad. Sci. USA 89, 5381-5383 (1992)); Nielsen and Janda (Methods: A
Companion to Methods in Enzymology 6, 361-371 (1994)); and Nielsen, Brenner and
Janda (J. Am. Chem. Soc. 115, 9812-9813 (1993)), each of which is incorporated
herein by reference in its entirety. Such tags can be amplified, using for example,
polymerase chain reaction, to produce many copies of the tag and identify the tag by
sequencing. The sequence of the tag then identifies the structure of the binding
molecule, which can be synthesized in pure form and tested. To date, there has been
no report of the use of the methodology disclosed by Lerner et al. to prepare large
libraries. The present invention provides an improvement in methods to produce
DNA-encoded libraries, as well as the first examples of large (10^ members or
greater) libraries of DNA-encoded molecules in which the functional moiety is
synthesized using solution phase synthetic methods.

The present invention provides methods which enable facile synthesis of
oligonucleotide-encoded combinatorial libraries, and permit an efficient, high-fidelity
means of adding such an oligonucleotide tag to each member of a vast collection of
molecules.



The methods of the invention include methods for synthesizing bifunctional
molecules which comprise a first moiety ("functional moiety") which is made up of
building blocks, and a second moiety operatively linked to the first moiety,
comprising an oligonucleotide tag which identifies the structure of the first moiety,
i.e., the oligonucleotide tag indicates which building blocks were used in the
construction of the first moiety, as well as the order in which the building blocks were
linked. Generally, the information provided by the oligonucleotide tag is sufficient to
determine the building blocks used to construct the active moiety. In certain
embodiments, the sequence of the oligonucleotide tag is sufficient to determine the
arrangement of the building blocks in the functional moiety, for example, for peptidic
moieties, the amino acid sequence.
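
How a tag sequence can indicate both the identity and the order of the building blocks may be illustrated with a short decoding routine. The sketch below assumes a hypothetical encoding in which each synthesis cycle contributes one fixed-length codon to the tag and each cycle has its own codon table; the codon length, the codon tables and the building-block names are illustrative assumptions, not details taken from this description.

```python
def decode_tag(tag, codebooks, codon_length=3):
    """Translate an encoding tag into the ordered list of building blocks.

    codebooks[i] maps the codon added in cycle i to the building block that
    was coupled in that cycle, so the position of a codon in the tag gives
    the position of the building block in the functional moiety.
    """
    if len(tag) != codon_length * len(codebooks):
        raise ValueError("tag length does not match the number of cycles")
    blocks = []
    for cycle, codebook in enumerate(codebooks):
        codon = tag[cycle * codon_length:(cycle + 1) * codon_length]
        if codon not in codebook:
            raise KeyError(f"unknown codon {codon!r} in cycle {cycle + 1}")
        blocks.append(codebook[codon])
    return blocks


# Hypothetical codon tables for a two-cycle peptidic library.
CODEBOOKS = [
    {"AAC": "Gly", "AAG": "Ala"},
    {"CCA": "Phe", "CCT": "Leu"},
]

print(decode_tag("AAGCCA", CODEBOOKS))
```

For a peptidic functional moiety, the decoded list read in cycle order corresponds to the amino acid sequence.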

The term "functional moiety" as used herein, refers to a chemical moiety
comprising one or more building blocks. Preferably, the building blocks in the
functional moiety are not nucleic acids. The functional moiety can be a linear or
branched or cyclic polymer or oligomer or a small organic molecule.

The term "building block", as used herein, is a chemical structural unit which
is linked to other chemical structural units or can be linked to other such units. When 
the functional moiety is polymeric or oligomeric, the building blocks are the 
monomeric units of the polymer or oligomer. Building blocks can also include a 
scaffold structure ("scaffold building block") to which is, or can be, attached one or 
more additional structures ("peripheral building blocks"). 

It is to be understood that the term "building block" is used herein to refer to a 
chemical structural unit as it exists in a functional moiety and also in the reactive form 
used for the synthesis of the functional moiety. Within the functional moiety, a 
building block will exist without any portion of the building block which is lost as a 
consequence of incorporating the building block into the functional moiety. For 
example, in cases in which the bond-forming reaction releases a small molecule (see 
below), the building block as it exists in the functional moiety is a "building block
residue", that is, the remainder of the building block used in the synthesis following 
loss of the atoms that it contributes to the released molecule. 

The building blocks can be any chemical compounds which are 
complementary, that is the building blocks must be able to react together to form a 
structure comprising two or more building blocks. Typically, all of the building 
blocks used will have at least two reactive groups, although it is possible that some of
the building blocks (for example the last building block in an oligomeric functional
moiety) used will have only one reactive group each. Reactive groups on two
different building blocks should be complementary, i.e., capable of reacting together
to form a covalent bond, optionally with the concomitant loss of a small molecule,
such as water, HCl, HF, and so forth.

For the present purposes, two reactive groups are complementary if they are 
capable of reacting together to form a covalent bond. In a preferred embodiment, the 
bond forming reactions occur rapidly under ambient conditions without substantial
formation of side products. Preferably, a given reactive group will react with a given
complementary reactive group exactly once. In one embodiment, complementary
reactive groups of two building blocks react, for example, via nucleophilic
substitution, to form a covalent bond. In one embodiment, one member of a pair of
complementary reactive groups is an electrophilic group and the other member of the
pair is a nucleophilic group.

Complementary electrophilic and nucleophilic groups include any two groups
which react via nucleophilic substitution under suitable conditions to form a covalent
bond. A variety of suitable bond-forming reactions are known in the art. See, for
example, March, Advanced Organic Chemistry, fourth edition, New York: John
Wiley and Sons (1992), Chapters 10 to 16; Carey and Sundberg, Advanced Organic
Chemistry, Part B, Plenum (1990), Chapters 1-11; and Collman et al., Principles and
Applications of Organotransition Metal Chemistry, University Science Books, Mill
Valley, Calif. (1987), Chapters 13 to 20; each of which is incorporated herein by
reference in its entirety. Examples of suitable electrophilic groups include reactive
carbonyl groups, such as acyl chloride groups, ester groups, including carbonyl
pentafluorophenyl esters and succinimide esters, ketone groups and aldehyde groups;
reactive sulfonyl groups, such as sulfonyl chloride groups, and reactive phosphonyl
groups. Other electrophilic groups include terminal epoxide groups, isocyanate
groups and alkyl halide groups. Suitable nucleophilic groups include primary and
secondary amino groups and hydroxyl groups and carboxyl groups.

Suitable complementary reactive groups are set forth below. One of skill in
the art can readily determine other reactive group pairs that can be used in the present
method, and the examples provided herein are not intended to be limiting.

In a first embodiment, the complementary reactive groups include activated 
carboxyl groups, reactive sulfonyl groups or reactive phosphonyl groups, or a
combination thereof, and primary or secondary amino groups. In this embodiment,
the complementary reactive groups react under suitable conditions to form an amide, 
sulfonamide or phosphonamidate bond. 

In a second embodiment, the complementary reactive groups include epoxide 
groups and primary or secondary amino groups. An epoxide-containing building
block reacts with an amine-containing building block under suitable conditions to
form a carbon-nitrogen bond, resulting in a β-amino alcohol.

In another embodiment, the complementary reactive groups include aziridine
groups and primary or secondary amino groups. Under suitable conditions, an
aziridine-containing building block reacts with an amine-containing building block to
form a carbon-nitrogen bond, resulting in a 1,2-diamine.

In a third embodiment, the complementary reactive groups include isocyanate
groups and primary or secondary amino groups. An isocyanate-containing building
block will react with an amino-containing building block under suitable conditions
to form a carbon-nitrogen bond, resulting in a urea group.

In a fourth embodiment, the complementary reactive groups include
isocyanate groups and hydroxyl groups. An isocyanate-containing building block will
react with a hydroxyl-containing building block under suitable conditions to form a
carbon-oxygen bond, resulting in a carbamate group.

In a fifth embodiment, the complementary reactive groups include amino 
groups and carbonyl-containing groups, such as aldehyde or ketone groups. Amines 
react with such groups via reductive amination to form a new carbon-nitrogen bond.

In a sixth embodiment, the complementary reactive groups include
phosphorus ylide groups and aldehyde or ketone groups. A phosphorus-ylide-containing
building block will react with an aldehyde or ketone-containing building
block under suitable conditions to form a carbon-carbon double bond, resulting in an
alkene.

In a seventh embodiment, the complementary reactive groups react via 
cycloaddition to form a cyclic structure. One example of such complementary 
reactive groups are alkynes and organic azides, which react under suitable conditions 
to form a triazole ring structure. An example of the use of this reaction to link two 
building blocks is illustrated in Figure 8. Suitable conditions for such reactions are 
known in the art and include those disclosed in WO 03/101972, the entire contents of 
which are incorporated by reference herein.

In an eighth embodiment, the complementary reactive groups are an alkyl
halide and a nucleophile, such as an amino group, a hydroxyl group or a carboxyl
group. Such groups react under suitable conditions to form a carbon-nitrogen (alkyl
halide plus amine) or carbon-oxygen (alkyl halide plus hydroxyl or carboxyl group)
bond.

In a ninth embodiment, the complementary functional groups are a
halogenated heteroaromatic group and a nucleophile, and the building blocks are
linked under suitable conditions via aromatic nucleophilic substitution. Suitable
halogenated heteroaromatic groups include chlorinated pyrimidines, triazines and
purines, which react with nucleophiles, such as amines, under mild conditions in
aqueous solution. Representative examples of the reaction of an oligonucleotide-
tagged trichlorotriazine with amines are shown in Figures 9 and 10. Examples of
suitable chlorinated heteroaromatic groups are shown in Figure 11.
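
For convenience, several of the complementary reactive group pairs described in the embodiments above can be collected in a lookup table. The following is a minimal sketch; the short group names used as keys are labels chosen for this illustration ("amine" stands for a primary or secondary amino group), and the table covers only pairs named in the text above.

```python
# Pairs of complementary reactive groups and the linkage each pair forms,
# transcribed from the embodiments above. Key names are illustrative labels.
LINKAGES = {
    frozenset({"activated carboxyl", "amine"}): "amide",
    frozenset({"reactive sulfonyl", "amine"}): "sulfonamide",
    frozenset({"reactive phosphonyl", "amine"}): "phosphonamidate",
    frozenset({"epoxide", "amine"}): "beta-amino alcohol",
    frozenset({"aziridine", "amine"}): "1,2-diamine",
    frozenset({"isocyanate", "amine"}): "urea",
    frozenset({"isocyanate", "hydroxyl"}): "carbamate",
    frozenset({"phosphorus ylide", "aldehyde or ketone"}): "alkene",
    frozenset({"alkyne", "azide"}): "triazole",
}


def linkage(group_a, group_b):
    """Return the linkage formed by two reactive groups, or None if the
    pair is not among the complementary pairs listed in the table."""
    return LINKAGES.get(frozenset({group_a, group_b}))


print(linkage("amine", "isocyanate"))
print(linkage("azide", "alkyne"))
print(linkage("alkyne", "hydroxyl"))
```

Because the keys are unordered sets, the lookup does not depend on which building block carries which member of the pair. A result of None means only that the pair is absent from this table, not that the groups cannot react.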

Additional bond-forming reactions that can be used to join building blocks in
the synthesis of the molecules and libraries of the invention include those shown
below. The reactions shown below emphasize the reactive functional groups.
Various substituents can be present in the reactants, including those labeled R1, R2, R3
and R4. The possible positions which can be substituted include, but are not limited
to, those indicated by R1, R2, R3 and R4. These substituents can include any suitable
chemical moieties, but are preferably limited to those which will not interfere with or
significantly inhibit the indicated reaction, and, unless otherwise specified, can
include hydrogen, alkyl, substituted alkyl, aryl, substituted aryl, heteroaryl,
substituted heteroaryl, alkoxy, aryloxy, arylalkyl, substituted arylalkyl, amino,
substituted amino and others as are known in the art. Suitable substituents on these
groups include alkyl, aryl, heteroaryl, cyano, halogen, hydroxyl, nitro, amino,
mercapto, carboxyl, and carboxamide. Where specified, suitable electron-
withdrawing groups include nitro, carboxyl, haloalkyl, such as trifluoromethyl and
others as are known in the art. Examples of suitable electron-donating groups include
alkyl, alkoxy, hydroxyl, amino, halogen, acetamido and others as are known in the art.

Addition of a primary amine to an alkene: [reaction scheme not reproduced]

Nucleophilic substitution: [reaction scheme not reproduced]

Reductive alkylation of an amine: [reaction scheme not reproduced; reagent shown: NaBH(OAc)3]

Palladium catalyzed carbon-carbon bond forming reactions: [reaction schemes not reproduced]

Electrophilic aromatic substitution reactions: [reaction scheme not reproduced]

X is an electron-donating group.

Imine/iminium/enamine forming reactions: [reaction schemes not reproduced]

Cycloaddition reactions: [reaction schemes not reproduced]

Diels-Alder cycloaddition

1,3-dipolar cycloaddition, X-Y-Z = C-N-O, C-N-S, N3,

Nucleophilic aromatic substitution reactions: [reaction schemes not reproduced]

W is an electron withdrawing group.

Examples of suitable substituents X and Y include substituted or unsubstituted
amino, substituted or unsubstituted alkoxy, substituted or unsubstituted thioalkoxy,
substituted or unsubstituted aryloxy and substituted and unsubstituted thioaryloxy.

[reaction scheme not reproduced; conditions shown: SnCl2, 80°C]

Heck reaction: [reaction scheme not reproduced]

Acetal formation: [reaction scheme not reproduced]

Examples of suitable substituents X and Y include substituted and
unsubstituted amino, hydroxyl and sulfhydryl; Y is a linker that connects X and Y and
is suitable for forming the ring structure found in the product of the reaction.

Aldol reactions: [reaction scheme not reproduced]

Examples of suitable substituents X include O, S and NR^.



Scaffold building blocks which can be used to form the molecules and
libraries of the invention include those which have two or more functional groups
which can participate in bond forming reactions with peripheral building block
precursors, for example, using one or more of the bond forming reactions discussed
above. Scaffold moieties may also be synthesized during construction of the libraries
and molecules of the invention, for example, using building block precursors which
can react in specific ways to form molecules comprising a central molecular moiety to
which are appended peripheral functional groups. In one embodiment, a library of
the invention comprises molecules comprising a constant scaffold moiety, but
different peripheral moieties or different arrangements of peripheral moieties. In
certain libraries, all library members comprise a constant scaffold moiety; other
libraries can comprise molecules having two or more different scaffold moieties.
Examples of scaffold moiety-forming reactions that can be used in the construction of
the molecules and libraries of the invention are set forth in the Table. The references
cited in the table are incorporated herein by reference in their entirety. The groups R1,
R2, R3 and R4 are limited only in that they should not interfere with, or significantly
inhibit, the indicated reaction, and can include hydrogen, alkyl, substituted alkyl,
heteroalkyl, substituted heteroalkyl, cycloalkyl, heterocycloalkyl, substituted
cycloalkyl, substituted heterocycloalkyl, aryl, substituted aryl, arylalkyl,
heteroarylalkyl, substituted arylalkyl, substituted heteroarylalkyl, heteroaryl,
substituted heteroaryl, halogen, alkoxy, aryloxy, amino, substituted amino and others
as are known in the art. Suitable substituents include, but are not limited to, alkyl,
alkoxy, thioalkoxy, nitro, hydroxyl, sulfhydryl, aryloxy, aryl-S-, halogen, carboxy,
amino, alkylamino, dialkylamino, arylamino, cyano, cyanate, nitrile, isocyanate,
thiocyanate, carbamyl, and substituted carbamyl.

It is to be understood that the synthesis of a functional moiety can proceed via
one particular type of coupling reaction, such as, but not limited to, one of the
reactions discussed above, or via a combination of two or more coupling reactions,
such as two or more of the coupling reactions discussed above. For example, in one
embodiment, the building blocks are joined by a combination of amide bond
formation (amino and carboxylic acid complementary groups) and reductive
amination (amino and aldehyde or ketone complementary groups). Any coupling
chemistry can be used, provided that it is compatible with the presence of an
oligonucleotide. Double stranded (duplex) oligonucleotide tags, as used in certain
embodiments of the present invention, are chemically more robust than single
stranded tags, and, therefore, tolerate a broader range of reaction conditions and
enable the use of bond-forming reactions that would not be possible with single-
stranded tags.

A building block can include one or more functional groups in addition to the
reactive group or groups employed to form the functional moiety. One or more of
these additional functional groups can be protected to prevent undesired reactions of
these functional groups. Suitable protecting groups are known in the art for a variety
of functional groups (Greene and Wuts, Protective Groups in Organic Synthesis,
second edition, New York: John Wiley and Sons (1991), incorporated herein by
reference). Particularly useful protecting groups include t-butyl esters and ethers,
acetals, trityl ethers and amines, acetyl esters, trimethylsilyl ethers, trichloroethyl
ethers and esters and carbamates.

In one embodiment, each building block comprises two reactive groups, which
can be the same or different. For example, each building block added in cycle s can
comprise two reactive groups which are the same, but which are both complementary
to the reactive groups of the building blocks added at steps s-1 and s+1. In another
embodiment, each building block comprises two reactive groups which are
themselves complementary. For example, a library comprising polyamide molecules
can be produced via reactions between building blocks comprising two primary amino
groups and building blocks comprising two activated carboxyl groups. In the
resulting compounds there is no N- or C-terminus, as alternate amide groups have
opposite directionality. Alternatively, a polyamide library can be produced using
building blocks that each comprise an amino group and an activated carboxyl group.
In this embodiment, the building blocks added in step n of the cycle will have a free
reactive group which is complementary to the available reactive group on the n-1
building block, while, preferably, the other reactive group on the nth building block is
protected. For example, if the members of the library are synthesized from the C to
N direction, the building blocks added will comprise an activated carboxyl group and
a protected amino group.

The functional moieties can be polymeric or oligomeric moieties, such as 
peptides, peptidomimetics, peptide nucleic acids or peptoids, or they can be small 
non-polymeric molecules, for example, molecules having a structure comprising a 
central scaffold and structures arranged about the periphery of the scaffold. Linear 
polymeric or oligomeric libraries will result from the use of building blocks having 
two reactive groups, while branched polymeric or oligomeric libraries will result from 
the use of building blocks having three or more reactive groups, optionally in 
combination with building blocks having only two reactive groups. Such molecules 
can be represented by the general formula X1X2...Xn, where each X is a monomeric 
unit of a polymer comprising n monomeric units, where n is an integer greater than 1. 
In the case of oligomeric or polymeric compounds, the terminal building blocks need 
not comprise two functional groups. For example, in the case of a polyamide library, 
the C-terminal building block can comprise an amino group, but the presence of a 
carboxyl group is optional. Similarly, the building block at the N-terminus can 
comprise a carboxyl group, but need not contain an amino group. 

Branched oligomeric or polymeric compounds can also be synthesized 
provided that at least one building block comprises three functional groups which are 
reactive with other building blocks. A library of the invention can comprise linear 
molecules, branched molecules or a combination thereof. 

Libraries can also be constructed using, for example, a scaffold building block 
having two or more reactive groups, in combination with other building blocks having 
only one available reactive group, for example, where any additional reactive groups 
are either protected or not reactive with the other reactive groups present in the 
scaffold building block. In one embodiment, for example, the molecules synthesized 
can be represented by the general formula X(Y)n, where X is a scaffold building 
block; each Y is a building block linked to X and n is an integer of at least two, and 
preferably an integer from 2 to about 6. In one preferred embodiment, the initial 
building block of cycle 1 is a scaffold building block. In molecules of the formula 
X(Y)n, each Y can be the same or different, but in most members of a typical library, 
each Y will be different. 

In one embodiment, the libraries of the invention comprise polyamide 
compounds. The polyamide compounds can be composed of building blocks derived 
from any amino acids, including the twenty naturally occurring α-amino acids, such 
as alanine (Ala; A), glycine (Gly; G), asparagine (Asn; N), aspartic acid (Asp; D), 
glutamic acid (Glu; E), histidine (His; H), leucine (Leu; L), lysine (Lys; K), 
phenylalanine (Phe; F), tyrosine (Tyr; Y), threonine (Thr; T), serine (Ser; S), arginine 
(Arg; R), valine (Val; V), glutamine (Gln; Q), isoleucine (Ile; I), cysteine (Cys; C), 
methionine (Met; M), proline (Pro; P) and tryptophan (Trp; W), where the three-letter 
and one-letter codes for each amino acid are given. In their naturally occurring form, 
each of the foregoing amino acids exists in the L-configuration, which is to be 
assumed herein unless otherwise noted. In the present method, however, the 
D-configuration forms of these amino acids can also be used. These D-amino acids are 
indicated herein by lower case three- or one-letter code, i.e., ala (a), gly (g), leu (l), 
gln (q), thr (t), ser (s), and so forth. The building blocks can also be derived from 
other α-amino acids, including, but not limited to, 3-arylalanines, such as 
naphthylalanine, phenyl-substituted phenylalanines, including 4-fluoro-, 4-chloro-, 
4-bromo- and 4-methylphenylalanine; 3-heteroarylalanines, such as 3-pyridylalanine, 
3-thienylalanine, 3-quinolylalanine, and 3-imidazolylalanine; ornithine; citrulline; 
homocitrulline; sarcosine; homoproline; homocysteine; substituted proline, such as 
hydroxyproline and fluoroproline; dehydroproline; norleucine; O-methyltyrosine; 
O-methylserine; O-methylthreonine and 3-cyclohexylalanine. Each of the preceding 
amino acids can be utilized in either the D- or L-configuration. 
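
The case convention described above (upper case for the L-configuration, lower case for the D-configuration) can be illustrated with a short Python sketch. The lookup table follows the codes listed in the text; the function name `describe_residues` is a hypothetical helper introduced only for illustration. Glycine is achiral, so for it the case of the code carries no stereochemical meaning.

```python
# One-letter to three-letter codes for the twenty amino acids listed above.
ONE_TO_THREE = {
    "A": "Ala", "G": "Gly", "N": "Asn", "D": "Asp", "E": "Glu",
    "H": "His", "L": "Leu", "K": "Lys", "F": "Phe", "Y": "Tyr",
    "T": "Thr", "S": "Ser", "R": "Arg", "V": "Val", "Q": "Gln",
    "I": "Ile", "C": "Cys", "M": "Met", "P": "Pro", "W": "Trp",
}


def describe_residues(sequence: str) -> list[str]:
    """Expand a one-letter sequence; lower case marks the D-configuration."""
    residues = []
    for letter in sequence:
        three = ONE_TO_THREE.get(letter.upper())
        if three is None:
            raise ValueError(f"unknown amino acid code: {letter!r}")
        if letter.islower():
            residues.append(three.lower())  # D-form, written e.g. "ala"
        else:
            residues.append(three)          # L-form, written e.g. "Ala"
    return residues


print(describe_residues("AlQ"))
```

Here the mixed-case input stands for L-alanine, D-leucine and L-glutamine, in that order.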

The building blocks can also be amino acids which are not α-amino acids, 
such as α-azaamino acids; β-, γ-, δ- and ε-amino acids, and N-substituted amino acids, 
such as N-substituted glycine, where the N-substituent can be, for example, a 
substituted or unsubstituted alkyl, aryl, heteroaryl, arylalkyl or heteroarylalkyl group. 
In one embodiment, the N-substituent is a side chain from a naturally occurring or 
non-naturally occurring α-amino acid. 

The building block can also be a peptidomimetic structure, such as a dipeptide, 
tripeptide, tetrapeptide or pentapeptide mimetic. Such peptidomimetic building 
blocks are preferably derived from amino acyl compounds, such that the chemistry of 
addition of these building blocks to the growing poly(aminoacyl) group is the same 
as, or similar to, the chemistry used for the other building blocks. The building blocks 
can also be molecules which are capable of forming bonds which are isosteric with a 
peptide bond, to form peptidomimetic functional moieties comprising a peptide 
backbone modification, such as ψ[CH2S], ψ[CH2NH], ψ[CSNH2], ψ[NHCO], 
ψ[COCH2], and ψ[(E) or (Z) CH=CH]. In the nomenclature used above, ψ indicates 
the absence of an amide bond. The structure that replaces the amide group is specified 
within the brackets. 

In one embodiment, the invention provides a method of synthesizing a 
compound comprising or consisting of a functional moiety which is operatively linked 
to an encoding oligonucleotide. The method includes the steps of: (1) providing an 
initiator compound consisting of an initial functional moiety comprising n building 
blocks, where n is an integer of 1 or greater, wherein the initial functional moiety 
comprises at least one reactive group, and wherein the initial functional moiety is 
operatively linked to an initial oligonucleotide which encodes the n building blocks; 
(2) reacting the initiator compound with a building block comprising at least one 
complementary reactive group, wherein the at least one complementary reactive 
group is complementary to the reactive group of step (1), under suitable conditions for 
reaction of the reactive group and the complementary reactive group to form a 
covalent bond; (3) reacting the initial oligonucleotide with an incoming 
oligonucleotide in the presence of an enzyme which catalyzes ligation of the initial 
oligonucleotide and the incoming oligonucleotide, under conditions suitable for 
ligation of the incoming oligonucleotide and the initial oligonucleotide, thereby 
producing a molecule which comprises or consists of a functional moiety comprising 
n+1 building blocks which is operatively linked to an encoding oligonucleotide. If the 
functional moiety of step (3) comprises a reactive group, steps 1-3 can be repeated 
one or more times, thereby forming cycles 1 to i, where i is an integer of 2 or greater, 
with the product of step (3) of a cycle s-1, where s is an integer of i or less, becoming 
the initiator compound of step (1) of cycle s. In each cycle, one building block is 
added to the growing functional moiety and one oligonucleotide sequence, which 
encodes the new building block, is added to the growing encoding oligonucleotide. 
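
The bookkeeping of the cycle described above can be sketched in Python. This is a minimal data model under stated assumptions, not a description of the chemistry: the class `Conjugate`, the function `run_cycle`, and the four-base tag sequences are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conjugate:
    """A functional moiety linked to its encoding oligonucleotide."""
    building_blocks: tuple[str, ...]  # in order of addition
    tag: str                          # encoding sequence, written 5' to 3'


def run_cycle(initiator: Conjugate, block: str, incoming_tag: str) -> Conjugate:
    """Model steps (2) and (3): couple one block, then ligate one tag."""
    return Conjugate(
        building_blocks=initiator.building_blocks + (block,),
        tag=initiator.tag + incoming_tag,
    )


# Hypothetical initiator compound: one building block (n = 1) and its tag.
compound = Conjugate(building_blocks=("Gly",), tag="ACGT")

# Two further cycles; the product of each cycle is the next initiator.
for block, tag in [("Ala", "TTAG"), ("Phe", "CCGA")]:
    compound = run_cycle(compound, block, tag)

print(compound.building_blocks, compound.tag)
```

Each call adds exactly one building block and one tag segment, so the number of tag segments always equals the number of building blocks.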

In one embodiment, the initial initiator compound(s) is generated by reacting 
a first building block with an oligonucleotide (e.g., an oligonucleotide which includes 
PCR primer sequences or an initial oligonucleotide) or with a linker to which such an 
oligonucleotide is attached. In the embodiment set forth in Figure 5, the linker 
comprises a reactive group for attachment of a first building block and is attached to 
an initial oligonucleotide. In this embodiment, reaction of a building block, or in each 
of multiple aliquots, one of a collection of building blocks, with the reactive group of 
the linker and addition of an oligonucleotide encoding the building block to the initial 
oligonucleotide produces the one or more initial initiator compounds of the process 
set forth above. 

In a preferred embodiment, each individual building block is associated with a 
distinct oligonucleotide, such that the sequence of nucleotides in the oligonucleotide 
added in a given cycle identifies the building block added in the same cycle. 
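
Because each building block has a distinct tag in each cycle, the encoding oligonucleotide can be read back into a sequence of building blocks. The sketch below assumes fixed-length tags and one codebook per cycle; `TAG_LENGTH`, `CODEBOOKS`, `decode`, and all sequences are hypothetical and serve only to illustrate the lookup.

```python
# Hypothetical codebooks: one dictionary per cycle, fixed 4-base tags.
TAG_LENGTH = 4
CODEBOOKS = [
    {"ACGT": "Gly", "TGCA": "Ala"},   # tags used in cycle 1
    {"TTAG": "Phe", "GGAT": "Leu"},   # tags used in cycle 2
]


def decode(encoding: str) -> list[str]:
    """Read one tag per cycle and return the building blocks in order."""
    expected = TAG_LENGTH * len(CODEBOOKS)
    if len(encoding) != expected:
        raise ValueError(f"expected {expected} bases, got {len(encoding)}")
    blocks = []
    for cycle, codebook in enumerate(CODEBOOKS):
        tag = encoding[cycle * TAG_LENGTH:(cycle + 1) * TAG_LENGTH]
        blocks.append(codebook[tag])  # a KeyError signals an unknown tag
    return blocks


print(decode("TGCAGGAT"))
```

The position of a tag within the encoding sequence gives the cycle, and the tag itself gives the building block added in that cycle.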

The coupling of building blocks and ligation of oligonucleotides will generally 
occur at similar concentrations of starting materials and reagents. For example, 
concentrations of reactants on the order of micromolar to millimolar, for example 
from about 10 μM to about 10 mM, are preferred in order to have efficient coupling 
of building blocks. 

In certain embodiments, the method further comprises, following step (2), the 
step of scavenging any unreacted initial functional moiety. Scavenging any unreacted 
initial functional moiety in a particular cycle prevents the initial functional moiety of 
the cycle from reacting with a building block added in a later cycle. Such reactions 
could lead to the generation of functional moieties missing one or more building 
blocks, potentially leading to a range of functional moiety structures which 
correspond to a particular oligonucleotide sequence. Such scavenging can be 
accomplished by reacting any remaining initial functional moiety with a compound 
which reacts with the reactive group of step (2). Preferably, the scavenger compound 
reacts rapidly with the reactive group of step (2) and includes no additional reactive 
groups that can react with building blocks added in later cycles. For example, in the 
synthesis of a compound where the reactive group of step (2) is an amino group, a 
suitable scavenger compound is an N-hydroxysuccinimide ester, such as acetic acid 
N-hydroxysuccinimide ester. 

In another embodiment, the invention provides a method of producing a 
library of compounds, wherein each compound comprises a functional moiety 
comprising two or more building block residues which is operatively linked to an 
oligonucleotide. In a preferred embodiment, the oligonucleotide present in each 
molecule provides sufficient information to identify the building blocks within the 
molecule and, optionally, the order of addition of the building blocks. In this 
embodiment, the method of the invention comprises a method of synthesizing a 
library of compounds, wherein the compounds comprise a functional moiety 
comprising two or more building blocks which is operatively linked to an 
oligonucleotide which identifies the structure of the functional moiety. The method 
comprises the steps of (1) providing a solution comprising m initiator compounds, 
wherein m is an integer of 1 or greater, where the initiator compounds consist of a 
functional moiety comprising n building blocks, where n is an integer of 1 or greater, 
which is operatively linked to an initial oligonucleotide which identifies the n building 
blocks; (2) dividing the solution of step (1) into at least r fractions, wherein r is an 
integer of 2 or greater; (3) reacting each fraction with one of r building blocks, 
thereby producing r fractions comprising compounds consisting of a functional 
moiety comprising n+1 building blocks operatively linked to the initial 
oligonucleotide; (4) reacting each of the r fractions of step (3) with one of a set of r 
distinct incoming oligonucleotides under conditions suitable for enzymatic ligation of 
the incoming oligonucleotide to the initial oligonucleotide, thereby producing r 
fractions comprising molecules consisting of a functional moiety comprising n+1 
building blocks operatively linked to an elongated oligonucleotide which encodes the 
n+1 building blocks. Optionally, the method can further include the step of (5) 
recombining the r fractions, produced in step (4), thereby producing a solution 
comprising molecules consisting of a functional moiety comprising n+1 building 
blocks, which is operatively linked to an elongated oligonucleotide which encodes the 
n+1 building blocks. Steps (1) to (5) can be conducted one or more times to yield 
cycles 1 to i, where i is an integer of 2 or greater. In cycle s+1, where s is an integer 
of i-1 or less, the solution comprising m initiator compounds of step (1) is the solution 
of step (5) of cycle s. Likewise, the initiator compounds of step (1) of cycle s+1 are 
the products of step (4) in cycle s. 

Preferably the solution of step (2) is divided into r fractions in each cycle of 
the library synthesis. In this embodiment, each fraction is reacted with a unique 
building block. 
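
The split, couple, ligate and recombine steps of the library method can be sketched as an idealized Python simulation. The function `split_and_pool`, the building block names and the tag sequences are hypothetical; the model assumes every coupling and every ligation goes to completion and that every fraction contains all m initiator compounds.

```python
def split_and_pool(pool, blocks_with_tags):
    """One cycle: split into r fractions, couple, ligate, recombine.

    pool: list of (building_blocks, tag) tuples, the m initiator compounds.
    blocks_with_tags: list of r (building_block, incoming_tag) pairs.
    """
    recombined = []
    for block, incoming_tag in blocks_with_tags:   # one fraction per block
        for building_blocks, tag in pool:          # every initiator compound
            recombined.append((building_blocks + (block,), tag + incoming_tag))
    return recombined


# Hypothetical library: 1 initiator, then cycles with r = 2 and r = 3.
pool = [(("Gly",), "AAAA")]
cycle_1 = [("Ala", "ACGT"), ("Phe", "TGCA")]
cycle_2 = [("Ser", "GATC"), ("Leu", "CTAG"), ("Val", "TTGG")]
for cycle in (cycle_1, cycle_2):
    pool = split_and_pool(pool, cycle)

print(len(pool))
```

In this idealized model the pool after each cycle holds m x r members, and every member carries a tag that differs from all others.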

In the methods of the invention, the order of addition of the building block and 
the incoming oligonucleotide is not critical, and steps (2) and (3) of the synthesis of a 
molecule, and steps (3) and (4) in the library synthesis can be reversed, i.e., the 
incoming oligonucleotide can be ligated to the initial oligonucleotide before the new 
building block is added. In certain embodiments, it may be possible to conduct these 
two steps simultaneously. 

In certain embodiments, the method further comprises, following step (2), the 
step of scavenging any unreacted initial functional moiety. Scavenging any unreacted 
initial functional moiety in a particular cycle prevents the initial functional moiety of 
the cycle from reacting with a building block added in a later cycle. Such reactions 
could lead to the generation of functional moieties missing one or more building 
blocks, potentially leading to a range of functional moiety structures which 
correspond to a particular oligonucleotide sequence. Such scavenging can be 
accomplished by reacting any remaining initial functional moiety with a compound 
which reacts with the reactive group of step (2). Preferably, the scavenger compound 
reacts rapidly with the reactive group of step (2) and includes no additional reactive 
groups that can react with building blocks added in later cycles. For example, in the 
synthesis of a compound where the reactive group of step (2) is an amino group, a 
suitable scavenger compound is an N-hydroxysuccinimide ester, such as acetic acid 
N-hydroxysuccinimide ester. 

In one embodiment, the building blocks used in the library synthesis are 
selected from a set of candidate building blocks by evaluating the ability of the 
candidate building blocks to react with appropriate complementary functional groups 
under the conditions used for synthesis of the library. Building blocks which are 
shown to be suitably reactive under such conditions can then be selected for 
incorporation into the library. The products of a given cycle can, optionally, be 
purified. When the cycle is an intermediate cycle, i.e., any cycle prior to the final 
cycle, these products are intermediates and can be purified prior to initiation of the 
next cycle. If the cycle is the final cycle, the products of the cycle are the final 
products, and can be purified prior to any use of the compounds. This purification 
step can, for example, remove unreacted or excess reactants and the enzyme 
employed for oligonucleotide ligation. Any methods which are suitable for separating 
the products from other species present in solution can be used, including liquid 
chromatography, such as high performance liquid chromatography (HPLC) and 
precipitation with a suitable co-solvent, such as ethanol. Suitable methods for 
purification will depend upon the nature of the products and the solvent system used 
for synthesis. 

The reactions are, preferably, conducted in aqueous solution, such as a 
buffered aqueous solution, but can also be conducted in mixed aqueous/organic media 
consistent with the solubility properties of the building blocks, the oligonucleotides, 
the intermediates and final products and the enzyme used to catalyze the 
oligonucleotide ligation. 

It is to be understood that the theoretical number of compounds produced by a 
given cycle in the method described above is the product of the number of different 
initiator compounds, m, used in the cycle and the number of distinct building blocks 
added in the cycle, r. The actual number of distinct compounds produced in the cycle 
can be as high as the product of r and m (r x m), but could be lower, given differences 
in reactivity of certain building blocks with certain other building blocks. For 
example, the kinetics of addition of a particular building block to a particular initiator 
compound may be such that on the time scale of the synthetic cycle, little to none of 
the product of that reaction may be produced. 
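
The r x m relationship compounds over successive cycles, since the products of one cycle are the initiators of the next. A minimal Python sketch of this upper bound follows; the function name `theoretical_library_size` and the example numbers are hypothetical, and the result is a ceiling that real syntheses may not reach, for the reactivity reasons given above.

```python
import math


def theoretical_library_size(initiators: int, blocks_per_cycle: list[int]) -> int:
    """Upper bound: m initiators times the product of r over all cycles."""
    return initiators * math.prod(blocks_per_cycle)


# Hypothetical example: 1 initiator and 4 cycles of 96 building blocks each.
print(theoretical_library_size(1, [96, 96, 96, 96]))
```

With no cycles the function returns the number of initiators unchanged, because the product over an empty list is 1.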

In certain embodiments, a common building block is added prior to cycle 1, 
following the last cycle or in between any two cycles. For example, when the 
functional moiety is a polyamide, a common N-terminal capping building block can 
be added after the final cycle. A common building block can also be introduced 
between any two cycles, for example, to add a functional group, such as an alkyne or 
azide group, which can be utilized to modify the functional moieties, for example by 
cyclization, following library synthesis. 

The term "operatively linked", as used herein, means that two chemical 
structures are linked together in such a way as to remain linked through the various 
manipulations they are expected to undergo. Typically the functional moiety and the 
encoding oligonucleotide are linked covalently via an appropriate linking group. The 
linking group is a bivalent moiety with a site of attachment for the oligonucleotide 
and a site of attachment for the functional moiety. For example, when the functional 
moiety is a polyamide compound, the polyamide compound can be attached to the 
linking group at its N-terminus, its C-terminus or via a functional group on one of the 
side chains. The linking group is sufficient to separate the polyamide compound and 
the oligonucleotide by at least one atom, and preferably, by more than one atom, such 
as at least two, at least three, at least four, at least five or at least six atoms. 
Preferably, the linking group is sufficiently flexible to allow the polyamide compound 
to bind target molecules in a manner which is independent of the oligonucleotide. 

In one embodiment, the linking group is attached to the N-terminus of the 
polyamide compound and the 5'-phosphate group of the oligonucleotide. For 
example, the linking group can be derived from a linking group precursor comprising 
an activated carboxyl group on one end and an activated ester on the other end. 
Reaction of the linking group precursor with the N-terminal nitrogen atom will form 
an amide bond connecting the linking group to the polyamide compound or 
N-terminal building block, while reaction of the linking group precursor with the 
5'-hydroxy group of the oligonucleotide will result in attachment of the oligonucleotide 
to the linking group via an ester linkage. The linking group can comprise, for 
example, a polymethylene chain, such as a -(CH2)n- chain or a poly(ethylene glycol) 
chain, such as a -(CH2CH2O)n- chain, where in both cases n is an integer from 1 to 
about 20. Preferably, n is from 2 to about 12, more preferably from about 4 to about 
10. In one embodiment, the linking group comprises a hexamethylene (-(CH2)6-) 
group. 

When the building blocks are amino acid residues, the resulting functional 
moiety is a polyamide. The amino acids can be coupled using any suitable chemistry 
for the formation of amide bonds. Preferably, the coupling of the amino acid building 
blocks is conducted under conditions which are compatible with enzymatic ligation of 
oligonucleotides, for example, at neutral or near-neutral pH and in aqueous solution. 
In one embodiment, the polyamide compound is synthesized from the C-terminal to 
N-terminal direction. In this embodiment, the first, or C-terminal, building block is 
coupled at its carboxyl group to an oligonucleotide via a suitable linking group. The 
first building block is reacted with the second building block, which preferably has an 
activated carboxyl group and a protected amino group. Any activating/protecting 
group strategy which is suitable for solution phase amide bond formation can be used. 
For example, suitable activated carboxyl species include acyl fluorides (U.S. Patent 
No. 5,360,928, incorporated herein by reference in its entirety), symmetrical 
anhydrides and N-hydroxysuccinimide esters. The acyl groups can also be activated 
in situ, as is known in the art, by reaction with a suitable activating compound. 
Suitable activating compounds include dicyclohexylcarbodiimide (DCC), 
diisopropylcarbodiimide (DIC), 1-ethoxycarbonyl-2-ethoxy-1,2-dihydroquinoline 
(EEDQ), 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide hydrochloride (EDC), 
n-propane-phosphonic anhydride (PPA), N,N-bis(2-oxo-3-oxazolidinyl)imido-phosphoryl 
chloride (BOP-Cl), bromo-tris-pyrrolidinophosphonium 
hexafluorophosphate (PyBrop), diphenylphosphoryl azide (DPPA), Castro's reagent 
(BOP, PyBop), O-benzotriazolyl-N,N,N',N'-tetramethyluronium salts (HBTU), 
diethylphosphoryl cyanide (DEPCN), 2,5-diphenyl-2,3-dihydro-3-oxo-4-hydroxy-thiophene 
dioxide (Steglich's reagent; HOTDO), 1,1'-carbonyl-diimidazole (CDI), and 
4-(4,6-dimethoxy-1,3,5-triazin-2-yl)-4-methylmorpholinium chloride (DMT-MM). 
The coupling reagents can be employed alone or in combination with additives such 
as N,N-dimethyl-4-aminopyridine (DMAP), N-hydroxy-benzotriazole (HOBt), 
N-hydroxybenzotriazine (HOOBt), N-hydroxysuccinimide (HOSu), 
N-hydroxyazabenzotriazole (HOAt), azabenzotriazolyl-tetramethyluronium salts 
(HATU, HAPyU) or 2-hydroxypyridine. In certain embodiments, synthesis of a 
library requires the use of two or more activation strategies, to enable the use of a 
structurally diverse set of building blocks. For each building block, one skilled in the 
art can determine the appropriate activation strategy. 

The N-terminal protecting group can be any protecting group which is 
compatible with the conditions of the process, for example, protecting groups which 
are suitable for solution phase synthesis conditions. A preferred protecting group is 
the fluorenylmethoxycarbonyl ("Fmoc") group. Any potentially reactive functional 
groups on the side chain of the aminoacyl building block may also need to be suitably 
protected. Preferably the side chain protecting group is orthogonal to the N-terminal 
protecting group, that is, the side chain protecting group is removed under conditions 
which are different from those required for removal of the N-terminal protecting 
group. Suitable side chain protecting groups include the nitroveratryl group, which 
can be used to protect both side chain carboxyl groups and side chain amino groups. 
Another suitable side chain amine protecting group is the N-pent-4-enoyl group. 

The building blocks can be modified following incorporation into the 
functional moiety, for example, by a suitable reaction involving a functional group on 
one or more of the building blocks. Building block modification can take place 
following addition of the final building block or at any intermediate point in the 
synthesis of the functional moiety, for example, after any cycle of the synthetic 
process. When a library of bifunctional molecules of the invention is synthesized, 
building block modification can be carried out on the entire library or on a portion of 
the library, thereby increasing the degree of complexity of the library. Suitable 
building block modifying reactions include those reactions that can be performed 
under conditions compatible with the functional moiety and the encoding 
oligonucleotide. Examples of such reactions include acylation and sulfonation of 
amino groups or hydroxyl groups, alkylation of amino groups, esterification or 
thioesterification of carboxyl groups, amidation of carboxyl groups, epoxidation of 
alkenes, and other reactions as are known in the art. When the functional moiety 
includes a building block having an alkyne or an azide functional group, the 
azide/alkyne cycloaddition reaction can be used to derivatize the building block. For 
example, a building block including an alkyne can be reacted with an organic azide, 
or a building block including an azide can be reacted with an alkyne, in either case 
forming a triazole. Building block modification reactions can take place after 
addition of the final building block or at an intermediate point in the synthetic 
process, and can be used to append a variety of chemical structures to the functional 
moiety, including carbohydrates, metal binding moieties and structures for targeting 
certain biomolecules or tissue types. 

In another embodiment, the functional moiety comprises a linear series of 
building blocks and this linear series is cyclized using a suitable reaction. For 
example, if at least two building blocks in the linear array include sulfhydryl groups, 
the sulfhydryl groups can be oxidized to form a disulfide linkage, thereby cyclizing 
the linear array. For example, the functional moieties can be oligopeptides which 
include two or more L or D-cysteine and/or L or D-homocysteine moieties. The 
building blocks can also include other functional groups capable of reacting together 
to cyclize the linear array, such as carboxyl groups and amino or hydroxyl groups. 

In a preferred embodiment, one of the building blocks in the linear array 
comprises an alkyne group and another building block in the linear array comprises an 
azide group. The azide and alkyne groups can be induced to react via cycloaddition, 
resulting in the formation of a macrocyclic structure. In the example illustrated in 
Figure 9, the functional moiety is a polypeptide comprising a propargylglycine 
building block at its C-terminus and an azidoacetyl group at its N-terminus. Reaction 
of the alkyne and the azide group under suitable conditions results in formation of a 
cyclic compound, which includes a triazole structure within the macrocycle. In the 
case of a library, in one embodiment, each member of the library comprises alkyne- 
and azide-containing building blocks and can be cyclized in this way. In a second 
embodiment, all members of the library comprise alkyne- and azide-containing 
building blocks, but only a portion of the library is cyclized. In a third embodiment, 
only certain functional moieties include alkyne- and azide-containing building blocks, 
and only these molecules are cyclized. In the foregoing second and third 
embodiments, the library, following the cycloaddition reaction, will include both 
cyclic and linear functional moieties. 

In some embodiments of the invention in which the same functional moiety, 
e.g., triazine, is added to each and all of the fractions of the library during a particular 
synthesis step, it may not be necessary to add an oligonucleotide tag encoding that 
functional moiety. 

Oligonucleotides may be ligated by chemical or enzymatic methods. In one 
embodiment, oligonucleotides are ligated by chemical means. Chemical ligation of 
DNA and RNA may be performed using reagents such as water soluble carbodiimide 
and cyanogen bromide as taught by, for example, Shabarova et al. (1991) Nucleic 
Acids Research, 19, 4247-4251; Federova et al. (1996) Nucleosides and Nucleotides, 
15, 1137-1147; and Carriero and Damha (2003) Journal of Organic Chemistry, 68, 
8328-8338. In one embodiment, chemical ligation is performed using cyanogen 
bromide, 5 M in acetonitrile, in a 1:10 v/v ratio with 5' phosphorylated 
oligonucleotide in a pH 7.6 buffer (1 M MES + 20 mM MgCl2) at 0 degrees for 1 - 5 
minutes. The oligonucleotides may be double stranded, preferably with an overhang 
of about 5 to about 14 bases. The oligonucleotide may also be single stranded, in 
which case a splint with an overlap of about 6 bases with each of the oligonucleotides 
to be ligated is employed to position the reactive 5' and 3' moieties in proximity with 
each other. 

In another embodiment, the oligonucleotides are ligated using enzymatic 
methods. In one embodiment, the initial building block is operatively linked to an 
initial oligonucleotide. Prior to or following coupling of a second building block to 
the initial building block, a second oligonucleotide sequence which identifies the 
second building block is ligated to the initial oligonucleotide. Methods for ligating 
the initial oligonucleotide sequence and the incoming oligonucleotide sequence are set 
forth in Figures 1 and 2. In Figure 1, the initial oligonucleotide is double-stranded, 
and one strand includes an overhang sequence which is complementary to one end of 
the second oligonucleotide and brings the second oligonucleotide into contact with the 
initial oligonucleotide. Preferably the overhanging sequence of the initial 
oligonucleotide and the complementary sequence of the second oligonucleotide are 
both at least about 4 bases; more preferably, the two sequences are the same length. 
The initial oligonucleotide and the second oligonucleotide can be ligated using a 
suitable enzyme. If the initial oligonucleotide is linked to the first building block at 
the 5' end of one of the strands (the "top strand"), then the strand which is 
complementary to the top strand (the "bottom strand") will include the overhang 
sequence at its 5' end, and the second oligonucleotide will include a complementary 
sequence at its 5' end. Following ligation of the second oligonucleotide, a strand can 
be added which is complementary to the sequence of the second oligonucleotide 
which is 3' to the overhang complementary sequence, and which includes additional 
overhang sequence. 

In one embodiment, the oligonucleotide is elongated as set forth in Figure 2.
The oligonucleotide bound to the growing functional moiety and the incoming
oligonucleotide are positioned for ligation by the use of a "splint" sequence, which
includes a region which is complementary to the 3' end of the initial oligonucleotide
and a region which is complementary to the 5' end of the incoming oligonucleotide.
The splint brings the 3' end of the initial oligonucleotide into proximity with the 5'
end of the incoming oligonucleotide, and the two are joined by enzymatic ligation. In
the example illustrated in Figure 2, the initial oligonucleotide consists of 16
nucleobases and the splint is complementary to the 6 bases at the 3' end. The incoming
oligonucleotide consists of 12 nucleobases, and the splint is complementary to the 6
bases at the 5' terminus. The length of the splint and the lengths of the
complementary regions are not critical. However, the complementary regions should
be sufficiently long to enable stable dimer formation under the conditions of the
ligation, but not so long as to yield an excessively large encoding nucleotide in the
final molecules. It is preferred that the complementary regions are from about 4 bases
to about 12 bases, more preferably from about 5 bases to about 10 bases, and most
preferably from about 5 bases to about 8 bases in length.
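
The splint geometry described above can be illustrated with a short sketch. It is
not taken from the application: the function names and the two example sequences
are hypothetical, and the sketch assumes that all sequences are written 5' to 3' and
that the splint pairs with the last bases of the initial oligonucleotide and the first
bases of the incoming oligonucleotide.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def design_splint(initial: str, incoming: str,
                  pair_initial: int = 6, pair_incoming: int = 6) -> str:
    """Return a splint (5'->3') that spans the ligation junction.

    The splint pairs with the last `pair_initial` bases of `initial`
    and the first `pair_incoming` bases of `incoming`.
    """
    if not 1 <= pair_initial <= len(initial):
        raise ValueError("pair_initial must fit within the initial oligonucleotide")
    if not 1 <= pair_incoming <= len(incoming):
        raise ValueError("pair_incoming must fit within the incoming oligonucleotide")
    junction = initial[-pair_initial:] + incoming[:pair_incoming]
    return reverse_complement(junction)


# Hypothetical sequences with the lengths used in the Figure 2 example.
initial = "GATTACAGATTACAGG"   # 16 nucleobases
incoming = "CCTTGGAACCTT"      # 12 nucleobases
splint = design_splint(initial, incoming)
print(splint)  # CCAAGGCCTGTA
```

With the default arguments the splint is 12 bases long, 6 complementary to each
partner, which matches the lengths given for Figure 2.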

The split-and-pool methods used for library synthesis as set forth
herein assure that each unique functional moiety is operatively linked to at least one
unique oligonucleotide sequence which identifies the functional moiety. If 2 or more
different oligonucleotide tags are used for at least one building block in at least one of
the synthetic cycles, each distinct functional moiety comprising that building block
will be encoded by multiple oligonucleotides. For example, if 2 oligonucleotide tags
are used for each building block during the synthesis of a 4 cycle library, there will be
16 DNA sequences (2^4) that encode each unique functional moiety. There are several
potential advantages for encoding each unique functional moiety with multiple
sequences. First, selection of a different combination of tag sequences encoding the
same functional moiety assures that those molecules were independently selected.
Second, selection of a different combination of tag sequences encoding the same
functional moiety eliminates the possibility that the selection was based on the
sequence of the oligonucleotide. Third, a technical artifact can be recognized if
sequence analysis suggests that a particular functional moiety is highly enriched, but
only one sequence combination out of many possibilities appears. Multiple tagging
can be accomplished by having independent split reactions with the same building
block but a different oligonucleotide tag. Alternatively, multiple tagging can be
accomplished by mixing an appropriate ratio of each tag in a single tagging reaction
with an individual building block.
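
The count of 16 encoding sequences in the example above can be checked by
enumeration. The following is a minimal sketch under stated assumptions: the tag
sequences are invented for illustration, and the encoding sequence is modeled as a
plain concatenation of one tag per cycle, ignoring overhangs and primer regions.

```python
from itertools import product


def encoding_sequences(tags_per_cycle):
    """List every encoding sequence for one functional moiety.

    `tags_per_cycle` holds, for each synthetic cycle, the alternative
    tags assigned to the building block used in that cycle.
    """
    return ["".join(combo) for combo in product(*tags_per_cycle)]


# Hypothetical tags: 2 alternatives per building block, 4 cycles.
tags = [("AAC", "GGT"), ("CAT", "TCG"), ("ACG", "GTA"), ("CTA", "TGC")]
sequences = encoding_sequences(tags)
print(len(sequences))  # 16
```

Each of the 16 strings identifies the same four building blocks, so observing several
of them after selection is consistent with independent selection events.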

In one embodiment, the initial oligonucleotide is double-stranded and the two
strands are covalently joined. One means of covalently joining the two strands is
shown in Figure 3, in which a linking moiety is used to link the two strands and the
functional moiety. The linking moiety can be any chemical structure which
comprises a first functional group which is adapted to react with a building block, a
second functional group which is adapted to react with the 3'-end of an
oligonucleotide, and a third functional group which is adapted to react with the 5'-end
of an oligonucleotide. Preferably, the second and third functional groups are oriented
so as to position the two oligonucleotide strands in a relative orientation that permits
hybridization of the two strands. For example, the linking moiety can have the
general structure (I):

[Structure (I) not reproduced.]

where A is a functional group that can form a covalent bond with a building block, B
is a functional group that can form a bond with the 5'-end of an oligonucleotide, and
C is a functional group that can form a bond with the 3'-end of an oligonucleotide. D,
F and E are chemical groups that link functional groups A, C and B to S, which is a
core atom or scaffold. Preferably, D, E and F are each independently a chain of
atoms, such as an alkylene chain or an oligo(ethylene glycol) chain, and D, E and F
can be the same or different, and are preferably effective to allow hybridization of the
two oligonucleotides and synthesis of the functional moiety. In one embodiment, the
trivalent linker has the structure




[Chemical structure not reproduced.]

In this embodiment, the NH group is available for attachment to a building block,
while the terminal phosphate groups are available for attachment to an
oligonucleotide.

In embodiments in which the initial oligonucleotide is double-stranded, the
incoming oligonucleotides are also double-stranded. As shown in Figure 3, the initial
oligonucleotide can have one strand which is longer than the other, providing an
overhang sequence. In this embodiment, the incoming oligonucleotide includes an
overhang sequence which is complementary to the overhang sequence of the initial
oligonucleotide. Hybridization of the two complementary overhang sequences brings
the incoming oligonucleotide into position for ligation to the initial oligonucleotide.
This ligation can be performed enzymatically using a DNA or RNA ligase. The
overhang sequences of the incoming oligonucleotide and the initial oligonucleotide
are preferably the same length and consist of two or more nucleotides, preferably
from 2 to about 10 nucleotides, more preferably from 2 to about 6 nucleotides. In one
preferred embodiment, the incoming oligonucleotide is a double-stranded
oligonucleotide having an overhang sequence at each end. The overhang sequence at
one end is complementary to the overhang sequence of the initial oligonucleotide,
while, after ligation of the incoming oligonucleotide and the initial oligonucleotide,
the overhang sequence at the other end becomes the overhang sequence of the initial
oligonucleotide of the next cycle. In one embodiment, the three overhang sequences
are all 2 to 6 nucleotides in length, and the encoding sequence of the incoming
oligonucleotide is from 3 to 10 nucleotides in length, preferably 3 to 6 nucleotides in
length. In a particular embodiment, the overhang sequences are all 2 nucleotides in
length and the encoding sequence is 5 nucleotides in length.
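
The overhang requirement described above (same length, mutually complementary)
can be expressed as a simple check. This is an illustrative sketch only; the function
names are hypothetical, both overhangs are assumed to be written 5' to 3', and the
2-nucleotide examples are invented.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def overhangs_compatible(initial_overhang: str, incoming_overhang: str) -> bool:
    """True if two single-stranded overhangs (each 5'->3') can anneal fully."""
    return (len(initial_overhang) == len(incoming_overhang)
            and reverse_complement(initial_overhang) == incoming_overhang)


print(overhangs_compatible("AG", "CT"))   # True
print(overhangs_compatible("AG", "AG"))   # False
```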

In the embodiment illustrated in Figure 4, the incoming strand has a region at
its 3' end which is complementary to the 3' end of the initial oligonucleotide, leaving
overhangs at the 5' ends of both strands. The 5' ends can be filled in using, for
example, a DNA polymerase, such as Vent polymerase, resulting in a double-stranded
elongated oligonucleotide. The bottom strand of this oligonucleotide can be removed,
and additional sequence added to the 3' end of the top strand using the same method.

The encoding oligonucleotide tag is formed as the result of the successive
addition of oligonucleotides that identify each successive building block. In one
embodiment of the methods of the invention, the successive oligonucleotide tags may
be coupled by enzymatic ligation to produce an encoding oligonucleotide.

Enzyme-catalyzed ligation of oligonucleotides can be performed using any
enzyme that has the ability to ligate nucleic acid fragments. Exemplary enzymes
include ligases, polymerases, and topoisomerases. In specific embodiments of the
invention, DNA ligase (EC 6.5.1.1), DNA polymerase (EC 2.7.7.7), RNA polymerase
(EC 2.7.7.6) or topoisomerase (EC 5.99.1.2) are used to ligate the oligonucleotides.
Enzymes contained in each EC class can be found, for example, as described in
Bairoch (2000) Nucleic Acids Research 28:304-5.

In a preferred embodiment, the oligonucleotides used in the methods of the
invention are oligodeoxynucleotides and the enzyme used to catalyze the
oligonucleotide ligation is DNA ligase. In order for ligation to occur in the presence
of the ligase, i.e., for a phosphodiester bond to be formed between two
oligonucleotides, one oligonucleotide must have a free 5' phosphate group and the
other oligonucleotide must have a free 3' hydroxyl group. Exemplary DNA ligases
that may be used in the methods of the invention include T4 DNA ligase, Taq DNA
ligase, T4 RNA ligase, DNA ligase (E. coli) (all available from, for example, New
England Biolabs, MA).
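
The end-group requirement stated above can be captured in a small data model. The
sketch below is an assumption-laden illustration, not a description of any enzyme
kinetics: the `Oligo` class and `ligate` function are hypothetical names, and the
model tracks only whether each end carries the group needed for phosphodiester
bond formation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Oligo:
    sequence: str                 # written 5'->3'
    five_prime_phosphate: bool    # free 5' phosphate present
    three_prime_hydroxyl: bool    # free 3' hydroxyl present


def ligate(upstream: Oligo, downstream: Oligo) -> Oligo:
    """Join the 3' end of `upstream` to the 5' end of `downstream`."""
    if not upstream.three_prime_hydroxyl:
        raise ValueError("upstream oligonucleotide lacks a free 3' hydroxyl")
    if not downstream.five_prime_phosphate:
        raise ValueError("downstream oligonucleotide lacks a free 5' phosphate")
    return Oligo(upstream.sequence + downstream.sequence,
                 upstream.five_prime_phosphate,
                 downstream.three_prime_hydroxyl)


product_oligo = ligate(Oligo("ACGT", False, True), Oligo("TTGA", True, True))
print(product_oligo.sequence)  # ACGTTTGA
```

An incoming oligonucleotide without a 5' phosphate raises an error in this model,
which mirrors the nick that remains when an unphosphorylated strand is used.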

One of skill in the art will understand that each enzyme used for ligation has
optimal activity under specific conditions, e.g., temperature, buffer concentration, pH
and time. Each of these conditions can be adjusted, for example, according to the
manufacturer's instructions, to obtain optimal ligation of the oligonucleotide tags.

The incoming oligonucleotide can be of any desirable length, but is preferably
at least three nucleobases in length. More preferably, the incoming oligonucleotide is
4 or more nucleobases in length. In one embodiment, the incoming oligonucleotide is
from 3 to about 12 nucleobases in length. It is preferred that the oligonucleotides of
the molecules in the libraries of the invention have a common terminal sequence
which can serve as a primer for PCR, as is known in the art. Such a common terminal
sequence can be incorporated as the terminal end of the incoming oligonucleotide
added in the final cycle of the library synthesis. Alternatively, it can be added
following library synthesis, for example, using the enzymatic ligation methods
disclosed herein.

A preferred embodiment of the method of the invention is set forth in Figure
5. The process begins with a synthesized DNA sequence which is attached at its 5'
end to a linker which terminates in an amino group. In step 1, this starting DNA
sequence is ligated to an incoming DNA sequence in the presence of a splint DNA
strand, DNA ligase and dithiothreitol in Tris buffer. This yields a tagged DNA
sequence which can then be used directly in the next step or purified, for example,
using HPLC or ethanol precipitation, before proceeding to the next step. In step 2 the
tagged DNA is reacted with a protected activated amino acid, in this example, an
Fmoc-protected amino acid fluoride, yielding a protected amino acid-DNA conjugate.
In step 3, the protected amino acid-DNA conjugate is deprotected, for example, in the
presence of piperidine, and the resulting deprotected conjugate is, optionally, purified,
for example, by HPLC or ethanol precipitation. The deprotected conjugate is the
product of the first synthesis cycle, and becomes the starting material for the second
cycle, which adds a second amino acid residue to the free amino group of the
deprotected conjugate.

In embodiments in which PCR is to be used to amplify and/or sequence the
encoding oligonucleotides of selected molecules, the encoding oligonucleotides may
include, for example, PCR primer sequences and/or sequencing primers (e.g., primers
such as 3'-GAGTACGGCGGTCGCTCCG-5' and
3'-GAGTGGGCCGACCGTTGGG-5'). A PCR primer sequence can be included, for
example, in the initial oligonucleotide prior to the first cycle of synthesis, and/or it can
be included with the first incoming oligonucleotide, and/or it can be ligated to the
encoding oligonucleotide following the final cycle of library synthesis, and/or it can
be included in the incoming oligonucleotide of the final cycle. The PCR primer
sequences added following the final cycle of library synthesis and/or in the incoming
oligonucleotide of the final cycle are referred to herein as "capping sequences".

In one embodiment, the PCR primer sequence is designed into the encoding
oligonucleotide tag. For example, a PCR primer sequence may be incorporated into
the initial oligonucleotide tag and/or it may be incorporated into the final
oligonucleotide tag. In one embodiment, the same PCR primer sequence is
incorporated into the initial and final oligonucleotide tag. In another embodiment, a
first PCR primer sequence is incorporated into the initial oligonucleotide tag and a
second PCR primer sequence is incorporated in the final oligonucleotide tag.
Alternatively, the second PCR primer sequence may be incorporated into the capping
sequence as described herein. In preferred embodiments, the PCR primer sequence is
at least about 5, 7, 10, 13, 15, 17, 20, 22, or 25 nucleotides in length.

PCR primer sequences suitable for use in the libraries of the invention are
known in the art; suitable primers and methods are set forth, for example, in Innis, et
al., eds., PCR Protocols: A Guide to Methods and Applications, San Diego: Academic
Press (1990), the contents of which are incorporated herein by reference in their
entirety. Other suitable primers for use in the construction of the libraries described
herein are those primers described in PCT Publications WO 2004/069849 and WO
2005/003375, the contents of which are expressly incorporated herein by reference.

The term "polynucleotide" as used herein in reference to primers, probes and
nucleic acid fragments or segments to be synthesized by primer extension is defined
as a molecule comprised of two or more deoxyribonucleotides, preferably more than
three.

The term "primer" as used herein refers to a polynucleotide, whether purified
from a nucleic acid restriction digest or produced synthetically, which is capable of
acting as a point of initiation of nucleic acid synthesis when placed under conditions
in which synthesis of a primer extension product which is complementary to a nucleic
acid strand is induced, i.e., in the presence of nucleotides and an agent for
polymerization such as DNA polymerase, reverse transcriptase and the like, and at a
suitable temperature and pH. The primer is preferably single stranded for maximum
efficiency, but may alternatively be in double stranded form. If double stranded, the
primer is first treated to separate it from its complementary strand before being used
to prepare extension products. Preferably, the primer is a polydeoxyribonucleotide.
The primer must be sufficiently long to prime the synthesis of extension products in
the presence of the agents for polymerization. The exact lengths of the primers will
depend on many factors, including temperature and the source of primer.

The primers used herein are selected to be "substantially" complementary to 
the different strands of each specific sequence to be amplified. This means that the 
primer must be sufficiently complementary so as to non-randomly hybridize with its 
respective template strand. Therefore, the primer sequence may or may not reflect the
exact sequence of the template. 

The polynucleotide primers can be prepared using any suitable method, such
as, for example, the phosphotriester or phosphodiester methods described in Narang et
al. (1979) Meth. Enzymol. 68:90; U.S. Pat. No. 4,356,270, U.S. Pat. No. 4,458,066,
U.S. Pat. No. 4,416,988, U.S. Pat. No. 4,293,652; and Brown et al. (1979) Meth.
Enzymol. 68:109. The contents of all the foregoing documents are incorporated
herein by reference.

In cases in which the PCR primer sequences are included in an incoming
oligonucleotide, these incoming oligonucleotides will preferably be significantly
longer than the incoming oligonucleotides added in the other cycles, because they will
include both an encoding sequence and a PCR primer sequence.

In one embodiment, the capping sequence is added after the addition of the
final building block and final incoming oligonucleotide, and the synthesis of a library
as set forth herein includes the step of ligating the capping sequence to the encoding
oligonucleotide, such that the oligonucleotide portion of substantially all of the library
members terminates in a sequence that includes a PCR primer sequence. Preferably,
the capping sequence is added by ligation to the pooled fractions which are products
of the final synthetic cycle. The capping sequence can be added using the enzymatic
process used in the construction of the library.

In one embodiment, the same capping sequence is ligated to every member of
the library. In another embodiment, a plurality of capping sequences are used. In this
embodiment, oligonucleotide capping sequences containing variable bases are, for
example, ligated onto library members following the final synthetic cycle. In one
embodiment, following the final synthetic cycle, the fractions are pooled and then
split into fractions again, with each fraction having a different capping sequence
added. Alternatively, multiple capping sequences can be added to the pooled library
following the final synthesis cycle. In both embodiments, the final library members
will include molecules comprising specific functional moieties linked to identifying
oligonucleotides including two or more different capping sequences.

In one embodiment, the capping primer comprises an oligonucleotide
sequence containing variable, i.e., degenerate, nucleotides. Such degenerate bases
within the capping primers permit the identification of library molecules of interest by
determining whether a combination of building blocks is the consequence of PCR
duplication (identical sequence) or independent occurrences of the molecule (different
sequence). For example, such degenerate bases may reduce the potential number of
false positives identified during the biological screening of the encoded library.

In one embodiment, a degenerate capping primer comprises or has the
following sequence:

5'-CAGCGTTCGA-3'
3'-AA GTCGCAAGCT NNNNN GTCTGTTCGAAGTGGACG-5'

where N can be any of the 4 bases, permitting 1024 different sequences (4^5).
The primer has the following sequence after its ligation onto the library and
primer-extension:

5'-CAGCGTTCGAN'N'N'N'N'CAGACAAGCTTCACCTGC-3'
3'-AA GTCGCAAGCT NNNNN GTCTGTTCGAAGTGGACG-5'
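
One way the degenerate region might be used during sequence analysis is sketched
below. This is an illustration under stated assumptions rather than a method taken
from the application: the function name, the read representation and the example
reads are hypothetical, and each read is assumed to have already been parsed into a
building-block combination and its 5-base degenerate region.

```python
from collections import defaultdict


def summarize_reads(reads):
    """Summarize sequencing reads per building-block combination.

    `reads` is an iterable of (combination, degenerate_region) pairs.
    Returns {combination: (total_reads, distinct_degenerate_regions)}.
    """
    totals = defaultdict(int)
    regions = defaultdict(set)
    for combination, degenerate_region in reads:
        totals[combination] += 1
        regions[combination].add(degenerate_region)
    return {c: (totals[c], len(regions[c])) for c in totals}


# Five N positions with four possible bases each give 4**5 = 1024 variants.
variant_count = 4 ** 5

# Hypothetical reads: (building-block combination, degenerate region).
reads = [
    ("BB1-BB7", "ACGTA"),
    ("BB1-BB7", "ACGTA"),   # same region: consistent with PCR duplication
    ("BB1-BB7", "TTGCA"),   # different region: independent occurrence
    ("BB2-BB3", "GGGAC"),
    ("BB2-BB3", "GGGAC"),
    ("BB2-BB3", "GGGAC"),
]
summary = summarize_reads(reads)
print(summary)
```

In this toy data set both combinations have three reads, but only the first shows
more than one degenerate region, so only it has evidence of independent occurrences.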






In another embodiment, the capping primer comprises or has the following 
sequence: 

3'-AA GTCGCAAGCTACGABBBABBBABBBAGACTACCGCGCTCCCTCCG-5'

where B can be any of C, G or T, permitting 19,683 different sequences (3^9).
The design of the degenerate region in this primer improves DNA sequence analysis,
as the A bases that flank and punctuate the degenerate B bases prevent
homopolymeric stretches of greater than 3 bases, and facilitate sequence alignment.

WO 2007/053358 PCT/US2006/041356
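
Both claims about this design, the count of 19,683 variants and the limit of 3 on
homopolymer length, can be checked by brute-force enumeration. The sketch below
is illustrative only: the helper names are hypothetical, and it considers just the
degenerate segment ABBBABBBABBBA rather than the full primer.

```python
from itertools import groupby, product

TEMPLATE = "ABBBABBBABBBA"   # degenerate segment of the capping primer


def expand(template: str, alphabet: str = "CGT"):
    """Yield every sequence obtained by filling each B with C, G or T."""
    slots = template.count("B")
    for choice in product(alphabet, repeat=slots):
        fill = iter(choice)
        yield "".join(next(fill) if ch == "B" else ch for ch in template)


def longest_run(seq: str) -> int:
    """Length of the longest homopolymeric stretch in `seq`."""
    return max(len(list(group)) for _, group in groupby(seq))


variants = list(expand(TEMPLATE))
print(len(variants))                          # 19683
print(max(longest_run(v) for v in variants))  # 3
```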

In one embodiment, the degenerate capping oligonucleotide is ligated to the
members of the library using a suitable enzyme and the upper strand of the degenerate
capping oligonucleotide is subsequently polymerized using a suitable enzyme, such as
a DNA polymerase.

In another embodiment, the PCR priming sequence is a "universal adaptor" or
"universal primer". As used herein, a "universal adaptor" or "universal primer" is an
oligonucleotide that contains a unique PCR priming region, that is, for example, about
5, 7, 10, 13, 15, 17, 20, 22, or 25 nucleotides in length, and is located adjacent to a
unique sequencing priming region that is, for example, about 5, 7, 10, 13, 15, 17, 20,
22, or 25 nucleotides in length, and is optionally followed by a unique discriminating
key sequence (or sample identifier sequence) consisting of at least one of each of the
four deoxyribonucleotides (i.e., A, C, G, T).

As used herein, the term "discriminating key sequence" or "sample identifier
sequence" refers to a sequence that may be used to uniquely tag a population of
molecules from a sample. Multiple samples, each containing a unique sample
identifier sequence, can be mixed, sequenced and re-sorted after DNA sequencing for
analysis of individual samples. The same discriminating sequence can be used for an
entire library or, alternatively, different discriminating key sequences can be used to
track different libraries. In one embodiment, the discriminating key sequence is on
either the 5' PCR primer, the 3' PCR primer, or on both primers. If both PCR primers
contain a sample identifier sequence, the number of different samples that can be
pooled with unique sample identifier sequences is the product of the number of
sample identifier sequences on each primer. Thus, 10 different 5' sample identifier
sequence primers can be combined with 10 different 3' sample identifier sequence
primers to yield 100 different sample identifier sequence combinations.
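
The combinatorial use of sample identifier sequences, and the re-sorting of pooled
reads after sequencing, can be sketched as follows. The code is an illustration under
assumptions: the key labels, sample names and read payloads are placeholders, and
each read is assumed to arrive with its 5' and 3' keys already identified.

```python
from itertools import product


def key_pairs(five_prime_keys, three_prime_keys):
    """Every (5' key, 3' key) combination available for tagging samples."""
    return list(product(five_prime_keys, three_prime_keys))


def demultiplex(reads, sample_by_keys):
    """Sort pooled reads back into per-sample bins.

    `reads` yields (five_key, three_key, payload) tuples; reads whose key
    pair is not in `sample_by_keys` are skipped.
    """
    bins = {}
    for five_key, three_key, payload in reads:
        sample = sample_by_keys.get((five_key, three_key))
        if sample is not None:
            bins.setdefault(sample, []).append(payload)
    return bins


# Placeholder key labels: 10 for the 5' primer and 10 for the 3' primer.
five_keys = [f"F{i}" for i in range(10)]
three_keys = [f"R{i}" for i in range(10)]
pairs = key_pairs(five_keys, three_keys)
print(len(pairs))  # 100

sample_by_keys = {("F0", "R0"): "sample_1", ("F0", "R1"): "sample_2"}
reads = [("F0", "R0", "read_a"), ("F0", "R1", "read_b"), ("F0", "R0", "read_c")]
bins = demultiplex(reads, sample_by_keys)
```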

Non-limiting examples of 5' and 3' unique PCR primers containing
discriminating key sequences include the following:

5' primers (variable positions shown in brackets):

5' A - GCCTTGCCAGCCCGCTCAG[A]TGACTCCCAAATCGATGTG;
5' C - GCCTTGCCAGCCCGCTCAG[C]TGACTCCCAAATCGATGTG;
5' G - GCCTTGCCAGCCCGCTCAG[G]TGACTCCCAAATCGATGTG;
5' T - GCCTTGCCAGCCCGCTCAG[T]TGACTCCCAAATCGATGTG;
5' AA - GCCTTGCCAGCCCGCTCAG[AA]TGACTCCCAAATCGATGTG;
5' AC - GCCTTGCCAGCCCGCTCAG[AC]TGACTCCCAAATCGATGTG;
5' AG - GCCTTGCCAGCCCGCTCAG[AG]TGACTCCCAAATCGATGTG;
5' AT - GCCTTGCCAGCCCGCTCAG[AT]TGACTCCCAAATCGATGTG;
and
5' CA - GCCTTGCCAGCCCGCTCAG[CA]TGACTCCCAAATCGATGTG.

3' SID primers (variable positions shown in brackets):

3' A - GCCTCCCTCGCGCCATCAG[A]GCAGGTGAAGCTTGTCTG;
3' C - GCCTCCCTCGCGCCATCAG[C]GCAGGTGAAGCTTGTCTG;
3' G - GCCTCCCTCGCGCCATCAG[G]GCAGGTGAAGCTTGTCTG;
3' T - GCCTCCCTCGCGCCATCAG[T]GCAGGTGAAGCTTGTCTG;
3' AA - GCCTCCCTCGCGCCATCAG[AA]GCAGGTGAAGCTTGTCTG;
3' AC - GCCTCCCTCGCGCCATCAG[AC]GCAGGTGAAGCTTGTCTG;
3' AG - GCCTCCCTCGCGCCATCAG[AG]GCAGGTGAAGCTTGTCTG;
3' AT - GCCTCCCTCGCGCCATCAG[AT]GCAGGTGAAGCTTGTCTG;
and
3' CA - GCCTCCCTCGCGCCATCAG[CA]GCAGGTGAAGCTTGTCTG.

In one embodiment, the discriminating key sequence is about 4, 5, 6, 7, 8, 9, or
10 nucleotides in length. In another embodiment, the discriminating key sequence is a
combination of about 1-4 nucleotides. In yet another embodiment, each universal
adaptor is about forty-four nucleotides in length. In one embodiment the universal
adaptors are ligated, using T4 DNA ligase, onto the end of the encoding
oligonucleotide. Different universal adaptors may be designed specifically for each
library preparation and will, therefore, provide a unique identifier for each library.
The size and sequence of the universal adaptors may be modified as deemed
necessary by one of skill in the art.

In one embodiment, the universal adaptor added as a capping sequence is
linked to a support binding moiety. For example, a 5'-biotin is added to the universal
adaptor to allow, for example, isolation of single-stranded DNA template as well as
non-covalent coupling of the universal adaptor to the surface of a solid support that is
saturated with a biotin-binding protein (i.e., streptavidin, neutravidin or avidin). Other
linkages are well known in the art and may be used in place of biotin-streptavidin (for
example antibody/antigen-epitope, receptor/ligand and oligonucleotide pairing or
complementarity).

In another embodiment, the capping sequence contains anchor primer
sequences such that the members of the library may be attached to a solid substrate.
In one embodiment, the anchor primer sequences are annealed to the capping
sequences using recognized techniques (see, e.g., Hatch, et al. (1999) Genet. Anal.
Biomol. Engineer. 15: 35-40; U.S. Patent No. 5,714,320, and U.S. Patent No.
5,854,033). In general, any procedure for annealing the anchor primers to the capping
sequences is suitable as long as it results in formation of specific, i.e., perfect or
nearly perfect, complementarity between the adapter region or regions in the anchor
primer sequence and a sequence present in the capping sequences. The anchoring of
the encoding oligonucleotide to the solid surface may be reversible or irreversible,
e.g., the anchor to the solid surface may be cleavable or non-cleavable.

In one embodiment, the universal primer is annealed to a solid support that
contains oligonucleotide capture primers that are complementary to the PCR priming
regions of the universal adaptor ends.

In one embodiment, the solid support is a bead, for example, a sepharose bead.
The beads may be of any convenient size and fabricated from any number of known
materials. Examples of such materials include: inorganics, natural polymers, and
synthetic polymers. Specific examples of these materials include: cellulose, cellulose
derivatives, acrylic resins, glass, silica gels, polystyrene, gelatin, polyvinyl
pyrrolidone, co-polymers of vinyl and acrylamide, polystyrene cross-linked with
divinylbenzene or the like (see, Merrifield (1964) Biochemistry 3: 1385-1390),
polyacrylamides, latex gels, polystyrene, dextran, rubber, silicon, plastics,
nitrocellulose, celluloses, natural sponges, silica gels, glass, metals, plastic, cellulose,
cross-linked dextrans (e.g., Sephadex™) and agarose gel (Sepharose™) and solid
phase supports known to those of skill in the art.

The encoding oligonucleotides may be attached to the solid support capture
bead ("DNA capture bead") in any manner known in the art. Any suitable coupling
method known in the art can be used, such as, for example: using a water-soluble
carbodiimide to link the 5'-phosphate on the DNA to amine-coated capture beads
through a phosphoramidate bond; coupling specific oligonucleotide linkers to the bead
using similar chemistry, and using DNA ligase to link the DNA to the linker on the
bead; or joining the oligonucleotide to the beads using N-hydroxysuccinimide (NHS)
and its derivatives, such that one end of the linker may contain a reactive group
(such as an amide group) which forms a covalent bond with the solid support, while
the other end of the linker contains a second reactive group that can bond with the
oligonucleotide to be immobilized.

In another embodiment, the oligonucleotide is bound to the DNA capture bead
by non-covalent linkage; for example, chelation or antigen-antibody complexes may
be used to join the oligonucleotide to the bead. Oligonucleotide linkers can be
employed which specifically hybridize to unique sequences at the end of the DNA
fragment, such as the overlapping end from a restriction enzyme site or the "sticky
ends" of bacteriophage lambda based cloning vectors, but blunt-end ligations can also
be used beneficially. These methods are described in detail in U.S. Patent No.
5,674,743. It is preferred that any method used to immobilize the beads will continue
to bind the immobilized oligonucleotide throughout the steps in the methods of the
invention.

In one embodiment, the oligonucleotide is attached to a solid support
manufactured from, for example, glass, plastic, a nylon membrane, a gel matrix,
ceramics, silica, silicon, or any other non-reactive material as described in U.S. Patent
6,787,308, the entire contents of which are incorporated by reference. The supports
generally comprise a flat, i.e., planar, surface, or at least an array in which the
molecules to be analysed are in the same plane. The oligonucleotide may be attached
by specific covalent or non-covalent interactions. In one embodiment of the
invention, the surface of a solid support is coated with streptavidin or avidin. In
another embodiment of the invention, the solid surface is coated with an epoxide and
the molecules are coupled via an amine linkage. In yet another embodiment, the
encoding oligonucleotide may be attached to a solid support via hybridization to a
complementary nucleic acid molecule previously attached to the solid support.

In one embodiment, the solid support is pretreated to create surface chemistry
that facilitates oligonucleotide attachment and subsequent sequence analysis. In one
embodiment, the solid support is coated with a polyelectrolyte multilayer (PEM). In
another embodiment, the encoding oligonucleotide is attached to the surface of a
microfabricated channel or to the surface of reaction chambers that are disposed along
a microfabricated flow channel, optionally with streptavidin-biotin links. Each of
these attachment methods is described in PCT Publication No.
WO 2005/080605, the entire contents of which are incorporated by reference.

In one embodiment, the encoding oligonucleotide is attached to a solid surface
at high density and at single molecule resolution. In one embodiment, the encoding
oligonucleotide is attached to a solid surface at an individually-addressable location
(see, e.g., PCT Publication No. WO 2005/080605).

Attachment of the encoding oligonucleotide to any suitable solid surface can
occur prior to the hybridization of a primer for amplification and/or sequencing or,
alternatively, the encoding oligonucleotide can be attached to any suitable solid
surface after the hybridization of a primer for amplification and/or sequencing.

In another embodiment, the oligonucleotide is attached to a particle, such as a
microsphere, which is itself attached to a solid support. The microspheres may be of
any suitable size, typically in the range of from 10 nm to 100 nm in diameter.

In one embodiment, the universal adaptors are not 5'-phosphorylated.
Accordingly, "gaps" or "nicks" can be filled in by using a DNA polymerase enzyme
that can bind to, strand displace and extend the nicked DNA fragments according to
techniques recognized in the art. DNA polymerases that lack 3'->5' exonuclease
activity but exhibit 5'->3' exonuclease activity have the ability to recognize nicks,
displace the nicked strands, and extend the strand in a manner that results in the repair
of the nicks and in the formation of non-nicked double-stranded DNA (Hamilton, et
al. (2001) BioTechniques 31:370).

Several modifying enzymes are utilized for the nick repair step, including but
not limited to polymerase, ligase and kinase. DNA polymerases that can be used for
this application include, for example, E. coli DNA pol I, Thermoanaerobacter
thermohydrosulfuricus pol I, and bacteriophage phi 29. In one embodiment, the strand
displacing enzyme Bacillus stearothermophilus pol I (Bst DNA polymerase I) is used
to repair the nicked dsDNA and results in non-nicked dsDNA. In another
embodiment, the ligase is T4 and the kinase is polynucleotide kinase.

The invention further relates to the compounds which can be produced using
the methods of the invention, and collections of such compounds, either as isolated
species or pooled to form a library of chemical structures. Compounds of the
invention include compounds of the formula

[Formula not reproduced.]

where X is a functional moiety comprising one or more building blocks, Z is an
oligonucleotide attached at its 3' terminus to B and Y is an oligonucleotide which is
attached to C at its 5' terminus. A is a functional group that forms a covalent bond
with X, B is a functional group that forms a bond with the 3'-end of Z and C is a
functional group that forms a bond with the 5'-end of Y. D, F and E are chemical
groups that link functional groups A, C and B to S, which is a core atom or scaffold.
Preferably, D, E and F are each independently a chain of atoms, such as an alkylene
chain or an oligo(ethylene glycol) chain, and D, E and F can be the same or different,
and are preferably effective to allow hybridization of the two oligonucleotides and
synthesis of the functional moiety.

Preferably, Y and Z are substantially complementary and are oriented in the 
compound so as to enable Watson-Crick base pairing and duplex formation under 
suitable conditions. Y and Z are the same length or different lengths. Preferably, Y 
and Z are the same length, or one of Y and Z is from 1 to 10 bases longer than the 
other. In a preferred embodiment, Y and Z are each 10 or more bases in length and 
have complementary regions of ten or more base pairs. More preferably, Y and Z are 
substantially complementary throughout their length, i.e., they have no more than one 
mismatch per every ten base pairs. Most preferably, Y and Z are complementary 
throughout their length, i.e., except for any overhang region on Y or Z, the strands 
hybridize via Watson-Crick base pairing with no mismatches throughout their entire 
length.
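
The complementarity criterion above can be illustrated with a short Python sketch. It is not part of the disclosed method: the function names are hypothetical, overhangs are handled simply by comparing only the overlapping region, and the phrase "no more than one mismatch per every ten base pairs" is read here as mismatches x 10 <= paired length, which is an assumption.

```python
# Hypothetical helper names; a sketch under the assumptions stated above.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def count_mismatches(y: str, z: str) -> tuple[int, int]:
    """Pair Y (written 5'->3') against Z read 3'->5' and count non-Watson-Crick pairs.

    Only the overlapping region is compared, so any overhang on the longer
    strand is ignored. Returns (mismatches, paired_length).
    """
    z_antiparallel = z[::-1]  # read Z 3'->5' so it lines up with Y 5'->3'
    paired = min(len(y), len(z_antiparallel))
    mismatches = sum(
        1 for a, b in zip(y[:paired], z_antiparallel[:paired])
        if COMPLEMENT.get(a) != b
    )
    return mismatches, paired

def substantially_complementary(y: str, z: str) -> bool:
    """Assumed reading: at least ten paired bases, at most one mismatch per ten."""
    mismatches, paired = count_mismatches(y, z)
    return paired >= 10 and mismatches * 10 <= paired
```

For example, count_mismatches("ACGTACGTAC", "GTACGTACGT") pairs the two strands with no mismatches over ten bases, which corresponds to the "most preferably" case described above.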

S can be a single atom or a molecular scaffold. For example, S can be a 
carbon atom, a boron atom, a nitrogen atom or a phosphorus atom, or a polyatomic 
scaffold, such as a phosphate group or a cyclic group, such as a cycloalkyl, 
cycloalkenyl, heterocycloalkyl, heterocycloalkenyl, aryl or heteroaryl group. In one 
embodiment, the linker is a group of the structure 

                 OP(O)2O-(CH2CH2O)m-OPO3- 
                / 
   -N-(CH2)n-< 
                \ 
                 OP(O)2O-(CH2CH2O)p-OPO3- 

where each of n, m and p is, independently, an integer from 1 to about 20, preferably 
from 2 to 8, and more preferably from 3 to 6. In one particular embodiment, the 
linker has the structure shown below.



[Linker structure; figure not reproduced. Surviving fragment: -HN]




In one embodiment, the libraries of the invention include molecules consisting 
of a functional moiety composed of building blocks, where each functional moiety is 
operatively linked to an encoding oligonucleotide. The nucleotide sequence of the 
encoding oligonucleotide is indicative of the building blocks present in the functional 
moiety, and in some embodiments, the connectivity or arrangement of the building 
blocks. The invention provides the advantage that the methodology used to construct 
the functional moiety and that used to construct the oligonucleotide tag can be 
performed in the same reaction medium, preferably an aqueous medium, thus 
simplifying the method of preparing the library compared to methods in the prior art. 
In certain embodiments in which the oligonucleotide ligation steps and the building 
block addition steps can both be conducted in aqueous media, each reaction will have 
a different pH optimum. In these embodiments, the building block addition reaction 
can be conducted at a suitable pH and temperature in a suitable aqueous buffer. The 
buffer can then be exchanged for an aqueous buffer which provides a suitable pH for 
oligonucleotide ligation.

In another embodiment, the invention provides compounds, and libraries 
comprising such compounds, of Formula II: 

Z-L-At-X(Y)n 

where X is a molecular scaffold, each Y is, independently, a peripheral moiety, and n 
is an integer from 1 to 6. Each A is, independently, a building block and t is an 
integer from 0 to about 5. L is a linking moiety and Z is a single-stranded or 
double-stranded oligonucleotide which identifies the structure -At-X(Y)n. The structure 
X(Y)n can be, for example, one of the scaffold structures set forth in Table 8 (see 
below). In one embodiment, the invention provides compounds, and libraries 
comprising such compounds, of Formula III:



where t is an integer from 0 to about 5, preferably from 0 to 3, and each A is, 
independently, a building block. L is a linking moiety and Z is a single-stranded or 
double-stranded oligonucleotide which identifies each A and R1, R2, R3 and R4. R1, 
R2, R3 and R4 are each independently a substituent selected from hydrogen, alkyl, 
substituted alkyl, heteroalkyl, substituted heteroalkyl, cycloalkyl, heterocycloalkyl, 
substituted cycloalkyl, substituted heterocycloalkyl, aryl, substituted aryl, arylalkyl, 
heteroarylalkyl, substituted arylalkyl, substituted heteroarylalkyl, heteroaryl, 
substituted heteroaryl, alkoxy, aryloxy, amino, and substituted amino. In one 
embodiment, each A is an amino acid residue.

Libraries which include compounds of Formula II or Formula III can comprise 
at least about 100; 1000; 10,000; 100,000; 1,000,000 or 10,000,000 compounds of 
Formula II or Formula III. In one embodiment, the library is prepared via a method 
designed to produce a library comprising at least about 100; 1000; 10,000; 100,000; 
1,000,000 or 10,000,000 compounds of Formula II or Formula III. 

One advantage of the methods of the invention is that they can be used to 
prepare libraries comprising vast numbers of compounds. The ability to amplify 
encoding oligonucleotide sequences using known methods such as polymerase chain 
reaction ("PCR") means that selected molecules can be identified even if relatively 
few copies are recovered. This allows the practical use of very large libraries, which, 
as a consequence of their high degree of complexity, either comprise relatively few 
copies of any given library member, or require the use of very large volumes. For 
example, a library consisting of 10^8 unique structures in which each structure has 1 x 
10^12 copies (about 1 picomole), requires about 100 L of solution at 1 µM effective 
concentration. For the same library, if each member is represented by 1,000,000 
copies, the volume required is 100 µL at 1 µM effective concentration.
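
The volume estimate above is simple arithmetic, sketched below in Python. The inputs (10^8 unique structures, 10^12 or 10^6 copies per structure, 1 µM) are illustrative values, and treating the "effective concentration" as the total concentration of all library members is an assumption of this sketch. The function name is hypothetical.

```python
AVOGADRO = 6.022e23  # molecules per mole

def required_volume_liters(unique_structures: float,
                           copies_per_structure: float,
                           total_concentration_molar: float) -> float:
    """Volume needed to hold the whole library at the given total concentration."""
    total_moles = unique_structures * copies_per_structure / AVOGADRO
    return total_moles / total_concentration_molar

# Illustrative inputs: 1e8 structures at 1 uM total concentration.
large = required_volume_liters(1e8, 1e12, 1e-6)  # about 1.7 pmol per member
small = required_volume_liters(1e8, 1e6, 1e-6)   # one million copies per member
print(f"{large:.0f} L versus {small * 1e6:.0f} uL")
```

With Avogadro's number applied exactly, 10^12 copies is about 1.7 picomoles rather than 1 picomole, so the sketch returns roughly 166 L and 166 µL. These agree in order of magnitude with the rounded figures of about 100 L and 100 µL, and the millionfold reduction in volume is the same.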

In a preferred embodiment, the library comprises from about 10"^ to about 10^^ 
copies of each library member. Given differences in efficiency of synthesis among 
the library members, it is possible that different library members will have different 
numbers of copies in any given library. Therefore, although the number of copies of 
each member theoretically present in the library may be the same, the actual number 
of copies of any given library member is independent of the number of copies of any 
other member. More preferably, the compound libraries of the invention include at 
least about 10^, 10^ or 10^ copies of each library member, or of substantially all 
library members. By "substantially all" library members is meant at least about 85% 
of the members of the library, preferably at least about 90%, and more preferably at 
least about 95% of the members of the library.

Preferably, the library includes a sufficient number of copies of each member 
that multiple rounds (i.e., two or more) of selection against a biological target can be 
performed, with sufficient quantities of binding molecules remaining following the 
final round of selection to enable amplification of the oligonucleotide tags of the 
remaining molecules and, therefore, identification of the functional moieties of the 
binding molecules. A schematic representation of such a selection process is 
illustrated in Figure 6, in which 1 and 2 represent library members, B is a target 
molecule and X is a moiety operatively linked to B that enables the removal of B from 
the selection medium. In this example, compound 1 binds to B, while compound 2 
does not bind to B. The selection process, as depicted in Round 1, comprises (I) 
contacting a library comprising compounds 1 and 2 with B-X under conditions 
suitable for binding of compound 1 to B; (II) removing unbound compound 2; and (III) 
dissociating compound 1 from B and removing B-X from the reaction medium. The 
result of Round 1 is a collection of molecules that is enriched in compound 1 relative 
to compound 2. Subsequent rounds employing steps I-III result in further enrichment 
of compound 1 relative to compound 2. Although three rounds of selection are shown 
in Figure 6, in practice any number of rounds may be employed, for example from 
one round to ten rounds, to achieve the desired enrichment of binding molecules 
relative to non-binding molecules.
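
The effect of repeated selection rounds can be sketched numerically. The following Python example is illustrative only: the per-round recovery of the binder (50%) and the carry-over of the non-binder (1%) are assumed values, and the function name is hypothetical.

```python
def enrich(binder: float, nonbinder: float, rounds: int,
           binder_recovery: float = 0.5, nonbinder_carryover: float = 0.01):
    """Return (binder, nonbinder) copy numbers after each round of steps I-III.

    binder_recovery: assumed fraction of compound 1 recovered per round.
    nonbinder_carryover: assumed fraction of compound 2 that survives washing.
    """
    history = []
    for _ in range(rounds):
        binder *= binder_recovery
        nonbinder *= nonbinder_carryover
        history.append((binder, nonbinder))
    return history

# Start with one million copies of each compound and run three rounds.
for round_number, (c1, c2) in enumerate(enrich(1e6, 1e6, rounds=3), start=1):
    print(f"Round {round_number}: compound 1 = {c1:.0f}, "
          f"compound 2 = {c2:.0f}, ratio = {c1 / c2:.0f}")
```

Under these assumptions the ratio of compound 1 to compound 2 grows fiftyfold per round, while the absolute number of copies of compound 1 falls each round. This is why the library needs enough copies of each member for binders to remain after the final round.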

In the embodiment shown in Figure 6, there is no amplification (synthesis of 
more copies) of the compounds remaining after any of the rounds of selection. Such 
amplification can lead to a mixture of compounds which is not consistent with the 
relative amounts of the compounds remaining after the selection. This inconsistency 
is due to the fact that certain compounds may be more readily synthesized than other 
compounds, and thus may be amplified in a manner which is not proportional to their 
presence following selection. For example, if compound 2 is more readily synthesized 
than compound 1, the amplification of the molecules remaining after Round 2 would 
result in a disproportionate amplification of compound 2 relative to compound 1, and 
a resulting mixture of compounds with a much lower (if any) enrichment of 
compound 1 relative to compound 2.

In one embodiment, the target is immobilized on a solid support by any known 
immobilization technique. The solid support can be, for example, a water-insoluble 
matrix contained within a chromatography column or a membrane. The encoded 
library can be applied to a water-insoluble matrix contained within a chromatography 
column. The column is then washed to remove non-specific binders. Target-bound 
compounds can then be dissociated by changing the pH, salt concentration, organic 
solvent concentration, or other methods, such as competition with a known ligand to 
the target.

In another embodiment, the target is free in solution and is incubated with the 
encoded library. Compounds which bind to the target (also referred to herein as 
"ligands") are selectively isolated by a size separation step such as gel filtration or 
ultrafiltration. In one embodiment, the mixture of encoded compounds and the target 
biomolecule are passed through a size exclusion chromatography column (gel 
filtration), which separates any ligand-target complexes from the unbound 
compounds. The ligand-target complexes are transferred to a reverse-phase 
chromatography column, which dissociates the ligands from the target. The 
dissociated ligands are then analyzed by PCR amplification and sequence analysis of 
the encoding oligonucleotides. This approach is particularly advantageous in 
situations where immobilization of the target may result in a loss of activity.

Accordingly, in one aspect of the invention, methods are provided for 
identifying one or more compounds in a library of compounds, produced as described 
herein, that bind to a biological target and subsequently determining the structure of 
the functional moieties of the member(s) of the library of compounds that bind to the 
biological target.

For example, in one embodiment, one or more compounds which bind to a 
biological target can be identified by a method comprising the steps of: 

(A) synthesizing a library of compounds, wherein the compounds comprise a 
functional moiety comprising two or more building blocks which is operatively linked 
to an initial oligonucleotide which identifies the structure of the functional moiety by: 

(i) providing a solution comprising m initiator compounds, wherein m 
is an integer of 1 or greater, where the initiator compounds consist of a functional 
moiety comprising n building blocks, where n is an integer of 1 or greater, which is 
operatively linked to an initial oligonucleotide which identifies the n building blocks; 

(ii) dividing the solution of step (i) into r reaction vessels, wherein r is 
an integer of 2 or greater, thereby producing r aliquots of the solution; 

(iii) reacting the initiator compounds in each reaction vessel with one 
of r building blocks, thereby producing r aliquots comprising compounds consisting 
of a functional moiety comprising n+1 building blocks operatively linked to the initial 
oligonucleotide; and 

(iv) reacting the initial oligonucleotide in each aliquot with one of a set 
of r distinct incoming oligonucleotides in the presence of an enzyme which catalyzes 
the ligation of the incoming oligonucleotide and the initial oligonucleotide, under 
conditions suitable for enzymatic ligation of the incoming oligonucleotide and the 
initial oligonucleotide; thereby producing r aliquots of molecules consisting of a 
functional moiety comprising n+1 building blocks operatively linked to an elongated 
oligonucleotide which encodes the n+1 building blocks; 

(B) contacting the biological target with the library of compounds, or a 
portion thereof, under conditions suitable for at least one member of the library of 
compounds to bind to the target; 

(C) removing library members that do not bind to the target; 

(D) sequencing the encoding oligonucleotides of the at least one member of 
the library of compounds which binds to the target; and 

(E) using the sequences determined in step (D) to determine the structure of 
the functional moieties of the members of the library of compounds which bind to the 
biological target, thereby identifying one or more compounds which bind to the 
biological target.
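
The bookkeeping behind steps (A)(i)-(iv) and (E) can be sketched in Python as follows. This is a simplified illustration, not the disclosed chemistry: the building-block names, the three-base tags, the codebooks and the function names are all hypothetical, tags are modeled as plain strings joined end to end, and pooling of the aliquots between rounds is assumed.

```python
def split_and_pool_round(pool, codebook):
    """One round of steps (ii)-(iv): split the pool into r vessels, add one
    building block per vessel, ligate the matching tag, then pool again."""
    new_pool = []
    for tag, block in codebook.items():       # one reaction vessel per building block
        for moiety, oligo in pool:            # every compound in that aliquot
            new_pool.append((moiety + (block,), oligo + tag))
    return new_pool

def decode(oligo, codebooks, tag_length):
    """Step (E): read fixed-length tags back into building blocks, in order."""
    blocks = []
    for i, codebook in enumerate(codebooks):
        tag = oligo[i * tag_length:(i + 1) * tag_length]
        blocks.append(codebook[tag])
    return tuple(blocks)

# Hypothetical codebooks: tag sequence -> building block.
ROUND_0 = {"AAC": "Gly"}                               # initiator (m = 1, n = 1)
ROUND_1 = {"CCA": "Ala", "GGT": "Val"}                 # r = 2
ROUND_2 = {"GTG": "Phe", "TGT": "Leu", "ACA": "Ser"}   # r = 3
CODEBOOKS = [ROUND_0, ROUND_1, ROUND_2]

library = [(("Gly",), "AAC")]  # step (i): initiator with its initial oligonucleotide
for codebook in (ROUND_1, ROUND_2):
    library = split_and_pool_round(library, codebook)

print(len(library), "library members")
print(decode("AACGGTACA", CODEBOOKS, tag_length=3))
```

Each round multiplies the number of distinct members by r, and because every tag is appended in the same vessel as its building block, the elongated oligonucleotide of any selected member can be read back into the sequence of building blocks.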

In one embodiment, the method further comprises ligating a degenerate 
capping oligonucleotide to the members of the library of compounds in the presence 
of an enzyme which catalyzes the ligation and polymerizing the degenerate capping 
oligonucleotide with an enzyme that catalyzes the polymerization of DNA. 

In one embodiment, the method may further comprise amplifying the 
encoding oligonucleotide of the at least one member of the library of compounds 
which binds to the target prior to sequencing.

In one embodiment of the invention, the selection and enrichment of the 
library is monitored using an oligonucleotide array. For example, a library of 
compounds may be hybridized to a solid surface, such as a chip comprising 
oligonucleotides, e.g., an Affymetrix oligonucleotide chip, which is subsequently 
fluoresced to detect the oligonucleotide tags bound to the surface. This hybridization 
can be repeated at each successive step of the screening process for identifying a 
compound with a desired biological activity.

In one embodiment, the library of compounds comprising encoding 
oligonucleotides which are optionally attached to capture beads as described above 
is emulsified as a heat stable water-in-oil emulsion to form a microcapsule according 
to the methods described in PCT Publications WO 2004/069849, WO 2005/003375, 
and WO 2005/073410. In one embodiment, the emulsion can be generated by 
suspending the oligonucleotide tag, with or without attached beads, in amplification 
solution, e.g., forming a "microreactor." As used herein, the term "amplification 
solution" means the mixture of reagents that is sufficient to perform 
amplification of template DNA. One example of an amplification solution is a PCR 
amplification solution, which one of skill in the art can readily prepare. 

In one embodiment of the invention, the library of compounds comprising 
encoding oligonucleotides is amplified to increase the copy number of encoding 
oligonucleotide molecules prior to sequencing. Encoding oligonucleotides may be 
amplified by any suitable method of DNA amplification including, for example, 
temperature cycling-polymerase chain reaction (PCR) (see, e.g., Saiki, et al. (1985) 
Science 230:1350-1354; Gingeras, et al. WO 88/10315; Davey, et al. European Patent 
Application Publication No. 329,822; Miller, et al. WO 89/06700), ligase chain 
reaction (see, e.g., Barany (1991) Proc. Natl. Acad. Sci. USA 88:189-193; Barringer, et 
al. (1990) Gene 89:117-122), transcription-based amplification (see, e.g., Kwoh, et al. 
(1989) Proc. Natl. Acad. Sci. USA 86:1173-1177), isothermal amplification systems - 
self-sustaining sequence replication (see, e.g., Guatelli, et al. (1990) Proc. Natl. 
Acad. Sci. USA 87:1874-1878); the Qβ replicase system (see, e.g., Lizardi, et al. 
(1988) BioTechnology 6:1197-1202); strand displacement amplification (Walker, et 
al. (1992) Nucleic Acids Res 20(7):1691-6); the methods described by Walker, et al. 
(Proc. Natl. Acad. Sci. USA (1992) 89(1):392-6); the methods described by Kievits, 
et al. (J Virol Methods (1991) 35(3):273-86); "race" (Frohman, In: PCR Protocols: A 
Guide to Methods and Applications, Academic Press, NY (1990)); "one-sided PCR" 
(Ohara, et al. (1989) Proc. Natl. Acad. Sci. U.S.A. 86:5673-5677); "di-oligonucleotide" 
amplification, isothermal amplification (Walker, et al. (1992) Proc. 
Natl. Acad. Sci. U.S.A. 89:392-396), and rolling circle amplification (reviewed in U.S. 
Patent No. 5,714,320).

In one embodiment, the library of compounds comprising encoding 
oligonucleotides is amplified prior to sequence analysis in a manner that minimizes any 
potential skew in the population distribution of DNA molecules present in the selected 
library mix. For example, only a small amount of library is recovered after a selection 
step and is typically amplified using PCR prior to sequence analysis. PCR has the 
potential to produce a skew in the population distribution of DNA molecules present 
in the selected library mix. This is especially problematic when the number of input 
molecules is small and the input molecules are poor PCR templates. PCR products 
produced at early cycles are more efficient templates than covalent duplex library, and 
therefore the frequency of these molecules in the final amplified population may be 
much higher than in the original input template.

Accordingly, in order to minimize this potential PCR skew, in one 
embodiment of the invention, a population of single-stranded oligonucleotides 
corresponding to the individual library members is produced by, for example, using 
one primer in a reaction, followed by PCR amplification using two primers. By doing 
so, there is a linear accumulation of single-stranded primer-extension product prior to 
exponential amplification using PCR, and the diversity and distribution of molecules 
in the accumulated primer-extension product more accurately reflect the diversity and 
distribution of molecules present in the original input template, since the exponential 
phase of amplification occurs only after much of the original molecular diversity 
present is represented in the population of molecules produced during the primer-extension 
reaction.
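
The reasoning above can be illustrated with a small Monte Carlo sketch in Python. It is a toy model under stated assumptions, not a description of the actual reaction kinetics: each original template is assumed to be copied with probability 0.1 per cycle (a poor template), each product with probability 0.9 per cycle, every library member starts with five templates, and all names and parameter values are hypothetical.

```python
import random
import statistics

def amplify(n_templates, pcr_cycles, linear_cycles=0,
            p_template=0.1, p_product=0.9, rng=random):
    """Return the number of product molecules for one library member.

    Original templates are copied with probability p_template per cycle;
    products are copied with probability p_product. During the linear
    (one-primer) phase products are not themselves copied.
    """
    def copies(n, p):
        return sum(1 for _ in range(n) if rng.random() < p)

    products = 0
    for _ in range(linear_cycles):           # one primer: linear accumulation
        products += copies(n_templates, p_template)
    for _ in range(pcr_cycles):              # two primers: exponential growth
        products += copies(products, p_product) + copies(n_templates, p_template)
    return products

def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.mean(values)

rng = random.Random(0)
direct = [amplify(5, pcr_cycles=6, rng=rng) for _ in range(200)]
primed = [amplify(5, pcr_cycles=6, linear_cycles=30, rng=rng) for _ in range(200)]
print("CV, PCR only:            ", round(coefficient_of_variation(direct), 2))
print("CV, linear phase then PCR:", round(coefficient_of_variation(primed), 2))
```

All 200 simulated members start with the same number of templates, so any spread in the final counts is skew introduced by amplification. In this model the spread is smaller when a linear primer-extension phase precedes PCR, because the exponential phase then starts from a product pool that already represents each member.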

Preferably, DNA amplification is performed by PCR. PCR amplification 
methods are described in detail in U.S. Patent Nos. 4,683,192, 4,683,202, 4,800,159, 
and 4,965,188, and at least in PCR Technology: Principles and Applications for DNA 
Amplification, H. Erlich, ed., Stockton Press, New York (1989); and PCR Protocols: 
A Guide to Methods and Applications, Innis et al., eds., Academic Press, San Diego, 
Calif. (1990). The contents of all the foregoing documents are incorporated herein by 
reference. In one embodiment of the invention, PCR amplification of the template is 
performed on an oligonucleotide tag bound to a bead, and encapsulated with a PCR 
solution comprising all the necessary reagents for a PCR reaction. In another 
embodiment of the invention, PCR amplification of the template is performed on a 
soluble oligonucleotide tag (i.e., not bound to a bead) which is encapsulated with a 
PCR solution comprising all the necessary reagents for a PCR reaction. PCR is 
subsequently performed by exposing the emulsion to any suitable thermocycling 
regimen known in the art. In one embodiment, between 30 and 50 cycles, preferably 
about 40 cycles, of amplification are performed. It is desirable, but not necessary, that 
the cycles of amplification be followed by one or more hybridization and 
extension cycles. In another embodiment, 
between 10 and 30 cycles, or about 25 cycles, of hybridization and extension are 
performed. In one embodiment, the template DNA is amplified until at least about 
two million to fifty million copies or about ten million to thirty million copies of the 
template DNA are immobilized per bead.
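
As a rough guide to the cycle numbers and copy numbers mentioned above, the following Python sketch computes the minimum number of cycles needed to reach a target copy number under idealized exponential growth. The function name and the per-cycle efficiency parameter are assumptions of this sketch, and the model ignores plateau effects and reagent limits inside a microreactor, so it gives only a lower bound.

```python
import math

def cycles_needed(start_copies: float, target_copies: float,
                  efficiency: float = 1.0) -> int:
    """Smallest whole number n with start * (1 + efficiency)**n >= target.

    efficiency is the assumed fraction of templates copied per cycle
    (1.0 means perfect doubling).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    growth = math.log(target_copies / start_copies) / math.log(1 + efficiency)
    return math.ceil(growth)

# One template per bead, idealized doubling.
print(cycles_needed(1, 2e6))        # lower end of the stated range
print(cycles_needed(1, 5e7))        # upper end of the stated range
print(cycles_needed(1, 1e7, 0.8))   # assumed 80% efficiency per cycle
```

With perfect doubling from a single template, two million copies need at least 21 cycles and fifty million need at least 26; at an assumed 80% efficiency, ten million copies need at least 28.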

Following amplification of the encoding oligonucleotide tag, the emulsion is 
"broken" (also referred to as "demulsification" in the art). There are many well known 
methods of breaking an emulsion (see, e.g., U.S. Patent No. 5,989,892 and references 
cited therein) and one of skill in the art would be able to select the proper method. 
For example, the emulsion may be broken by adding additional oil to cause the 
emulsion to separate into two phases. The oil phase is then removed, and a suitable 
organic solvent (e.g., hexanes) is added. After mixing, the oil/organic solvent phase is 
removed. This step may be repeated several times. Finally, the aqueous layer is 
removed. If the encoding oligonucleotides are attached to beads, the beads are then 
washed with an organic solvent/annealing buffer mixture, and then washed again in 
annealing buffer. Suitable organic solvents include alcohols such as methanol, 
ethanol and the like.

The amplified encoding oligonucleotides may then be resuspended in aqueous 
solution for use, for example, in a sequencing reaction according to known 
technologies. (See, e.g., Sanger, F. et al. (1977) Proc. Natl. Acad. Sci. U.S.A. 
75:5463-5467; Maxam & Gilbert (1977) Proc Natl Acad Sci USA 74:560-564; 
Ronaghi, et al. (1998) Science 281:363, 365; Lysov, et al. (1988) Dokl Akad Nauk 
SSSR 303:1508-1511; Bains & Smith (1988) J Theor Biol 135:303-307; Drmanac, R. 
et al. (1989) Genomics 4:114-128; Khrapko, et al. (1989) FEBS Lett 256:118-122; 
Pevzner (1989) J Biomol Struct Dyn 7:63-73; Southern, et al. (1992) Genomics 
13:1008-1017).

If the encoding oligonucleotide attached to a bead is to be used in a 
pyrophosphate-based sequencing reaction (described, e.g., in U.S. Patent Nos. 
6,274,320, 6,258,568 and 6,210,891, and incorporated herein by reference), then it is 
necessary to remove the second strand of the PCR product and anneal a sequencing 
primer to the single stranded template that is bound to the bead.

Briefly, the second strand is melted away using any number of commonly 
known methods such as NaOH, low ionic (e.g., salt) strength, or heat processing. 
Following this melting step, the beads are pelleted and the supernatant is discarded. 
The beads are resuspended in an annealing buffer, the sequencing primer added, and 
annealed to the bead-attached single stranded template using a standard annealing 
cycle.

The amplified encoding oligonucleotide, optionally on a bead, may be 
sequenced either directly or in a different reaction vessel. In one embodiment of the 
present invention, the encoding oligonucleotide is sequenced directly on the bead by 
transferring the bead to a reaction vessel and subjecting the DNA to a sequencing 
reaction (e.g., pyrophosphate or Sanger sequencing). Alternatively, the beads may be 
isolated and the encoding oligonucleotide may be removed from each bead and 
sequenced. Nonetheless, the sequencing steps may be performed on each individual 
bead and/or the beads that contain no nucleic acid template may be removed prior to 
distribution to a reaction vessel by, for example, biotin-streptavidin magnetic beads. 
Other suitable methods to separate beads are described in, for example, Bauer, J. 
(1999) J. Chromatography B, 722:55-69 and in Brody et al. (1999) Applied Physics 
Lett. 74:144-146.

Once the encoding oligonucleotide tag has been amplified, the sequence of the 
tag, and ultimately the composition of the selected molecule, can be determined using 
nucleic acid sequence analysis, a well known procedure for determining the sequence 
of nucleotide sequences. Nucleic acid sequence analysis is approached by a 
combination of (a) physicochemical techniques, based on the hybridization or 
denaturation of a probe strand plus its complementary target, and (b) enzymatic 
reactions with polymerases.

The nucleotide sequence of the oligonucleotide tag comprised of 
polynucleotides that identify the building blocks that make up the functional moiety 
as described herein, may be determined by the use of any sequencing method known 
to one of skill in the art. Suitable methods are described in, for example, Sanger, F. et 
al. (1977) Proc. Natl. Acad. Sci. U.S.A. 75:5463-5467; Maxam & Gilbert (1977) Proc 
Natl Acad Sci USA 74:560-564; Ronaghi, et al. (1998) Science 281:363, 365; Lysov, 
et al. (1988) Dokl Akad Nauk SSSR 303:1508-1511; Bains & Smith (1988) J 
Theor Biol 135:303-307; Drmanac, R. et al. (1989) Genomics 4:114-128; Khrapko, et 
al. (1989) FEBS Lett 256:118-122; Pevzner (1989) J Biomol Struct Dyn 7:63-73; 
Southern, et al. (1992) Genomics 13:1008-1017.

In a preferred embodiment, the oligonucleotide tags are sequenced using the 
apparati and methods described in PCT publications WO 2004/069849, WO 
2005/003375, WO 2005/073410, and WO 2005/054431, the entire contents of each of 
which are incorporated herein by this reference.

In one embodiment, a region of the sequence product is determined by 
annealing a sequencing primer to a region of the template nucleic acid, and then 
contacting the sequencing primer with a DNA polymerase and a known nucleotide 
triphosphate, i.e., dATP, dCTP, dGTP, dTTP, or an analog of one of these 
nucleotides, such as, for example, α-thio-dATP. The sequence can be determined by 
detecting a sequence reaction byproduct, using methods known in the art. 

In some embodiments, the nucleotide is modified to contain a disulfide-derivative 
of a hapten, such as biotin. The addition of the modified nucleotide to the 
nascent primer annealed to an anchored substrate is analyzed by a suitable 
post-polymerization method. Such methods enable a nucleotide to be identified in a given 
target position, and the DNA to be sequenced simply and rapidly while avoiding the 
need for electrophoresis and the use of potentially dangerous radiolabels.

Examples of suitable haptens include, for example, biotin, digoxygenin, the 
fluorescent dye molecules cy3 and cy5, and fluorescein. The attachment of the hapten 
can occur through linkages via the sugar, the base, and/or via the phosphate moiety on 
the nucleotide. Exemplary means for signal amplification following polymerization 
and extension of the encoding oligonucleotide include fluorescent, electrochemical 
and enzymatic means. In one embodiment using enzymatic amplification, the enzyme 
is one for which light-generating substrates are known, such as, for example, alkaline 
phosphatase (AP), horse-radish peroxidase (HRP), beta-galactosidase, or luciferase, 
and the means for the detection of these light-generating (chemiluminescent) 
substrates can include a CCD camera.

A sequencing primer can be of any length or base composition, as long as it is 
capable of specifically annealing to a region of the nucleic acid template (i.e., the 
oligonucleotide tag). The oligonucleotide primers of the present invention may be 
synthesized by conventional technology, e.g., with a commercial oligonucleotide 
synthesizer and/or by ligating together subfragments that have been so synthesized. 
No particular structure for the sequencing primer is required so long as it is able to 
specifically prime a region on the template nucleic acid. The sequencing primer is 
extended with the DNA polymerase to form a sequence product. The extension is 
performed in the presence of one or more types of nucleotide triphosphates, and if 
desired, auxiliary binding proteins. Incorporation of the dNTP is determined by, for 
example, assaying for the presence of a sequencing byproduct.

In one embodiment, the nucleic acid sequence of the oligonucleotide tag is 
determined by the use of the polymerase chain reaction (PCR). Briefly, the 
oligonucleotide tag (optionally attached to a bead) is subjected to a PCR reaction as 
follows. The appropriate sample is contacted with a PCR primer pair, each member 
of the pair having a pre-selected nucleotide sequence. The PCR primer pair is capable 
of initiating primer extension reactions by hybridizing to a PCR primer binding site 
on the encoding oligonucleotide tag. 

The PCR reaction is performed by mixing the PCR primer pair, preferably a 
predetermined amount thereof, with the nucleic acids of the encoding oligonucleotide 
tag, preferably a predetermined amount thereof, in a PCR buffer to form a PCR 
reaction admixture. The admixture is thermocycled for a number of cycles, which is 
typically predetermined, sufficient for the formation of a PCR reaction product. A 
sufficient amount of product is one that can be isolated in a sufficient amount to allow 
for DNA sequence determination.

PCR is typically carried out by thermocycling, i.e., repeatedly increasing and 
decreasing the temperature of a PCR reaction admixture within a temperature range 
whose lower limit is about 30 °C to about 55 °C and whose upper limit is about 90 °C 
to about 100 °C. The increasing and decreasing can be continuous, but is preferably 
phasic with time periods of relative temperature stability at each of temperatures 
favoring polynucleotide synthesis, denaturation and hybridization.

The PCR reaction is performed using any suitable method. Generally it occurs 
in a buffered aqueous solution, i.e., a PCR buffer, preferably at a pH of 7-9. 
Preferably, a molar excess of the primer is present. A large molar excess is preferred 
to improve the efficiency of the process.

The PCR buffer also contains the deoxyribonucleotide triphosphates 
(polynucleotide synthesis substrates) dATP, dCTP, dGTP, and dTTP and a 
polymerase, typically thermostable, all in adequate amounts for primer extension 
(polynucleotide synthesis) reaction. The resulting solution (PCR admixture) is heated 
to about 90 °C-100 °C for about 1 to 10 minutes, preferably from 1 to 4 minutes. 
After this heating period the solution is allowed to cool to 54 °C, which is preferable 
for primer hybridization. The synthesis reaction may occur at a temperature ranging 
from room temperature up to a temperature above which the polymerase (inducing 
agent) no longer functions efficiently. Thus, for example, if DNA polymerase is used, 
the temperature is generally no greater than about 40 °C. The thermocycling is 
repeated until the desired amount of PCR product is produced. An exemplary PCR 
buffer comprises the following reagents: 50 mM KCl; 10 mM Tris-HCl at pH 8.3; 1.5 
mM MgCl2; 0.001% (wt/vol) gelatin; 200 µM dATP; 200 µM dTTP; 200 µM 
dCTP; 200 µM dGTP; and 2.5 units Thermus aquaticus (Taq) DNA polymerase I per 
100 microliters of buffer.
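
Preparing such a buffer from stock solutions is a matter of applying C1 x V1 = C2 x V2 to each component, as in the Python sketch below. The final concentrations are those of the exemplary buffer, read here as 200 µM for each of the four dNTPs. The stock concentrations and the function name are hypothetical, and the gelatin, polymerase, primers and template are omitted for brevity.

```python
FINAL_MM = {  # final concentrations of the exemplary buffer, in mM
    "KCl": 50.0, "Tris-HCl pH 8.3": 10.0, "MgCl2": 1.5,
    "dATP": 0.2, "dTTP": 0.2, "dCTP": 0.2, "dGTP": 0.2,
}
STOCK_MM = {  # hypothetical stock concentrations, in mM
    "KCl": 1000.0, "Tris-HCl pH 8.3": 1000.0, "MgCl2": 25.0,
    "dATP": 10.0, "dTTP": 10.0, "dCTP": 10.0, "dGTP": 10.0,
}

def stock_volumes_ul(reaction_volume_ul: float) -> dict[str, float]:
    """Apply C1*V1 = C2*V2 to each component; the remainder is water."""
    volumes = {name: FINAL_MM[name] * reaction_volume_ul / STOCK_MM[name]
               for name in FINAL_MM}
    volumes["water"] = reaction_volume_ul - sum(volumes.values())
    return volumes

for name, volume in stock_volumes_ul(100.0).items():
    print(f"{name:>16}: {volume:5.1f} uL")
```

In practice the water volume would be reduced further to make room for the components omitted here.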

Suitable enzymes for elongating the primer sequences include, for example, E. 
coli DNA polymerase I, Taq DNA polymerase, Klenow fragment of E. coli DNA 
polymerase I, T4 DNA polymerase, other available DNA polymerases, reverse 
transcriptase, and other enzymes, including heat-stable enzymes, which will facilitate 
combination of the nucleotides in the proper manner to form the primer extension 
products which are complementary to each nucleic acid strand. Generally, the 
synthesis will be initiated at the 3' end of each primer and proceed in the 5' direction 
along the template strand, until synthesis terminates, producing molecules of different 
lengths. The newly synthesized DNA strand and its complementary strand form a 
double-stranded molecule which can be used in the succeeding steps of the analysis 
process.
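
The direction of primer extension described above can be made concrete with a short Python sketch. It is an idealized illustration: the function name and the example sequences are hypothetical, the primer is assumed to anneal by exact complementarity, and extension is assumed to run to the 5' end of the template.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def extend_primer(template_5to3: str, primer_5to3: str) -> str:
    """Anneal the primer to the template and extend it to the template's 5' end.

    Returns the extension product written 5'->3'. Raises ValueError if the
    primer has no exact binding site on the template.
    """
    # The binding site on the template is the reverse complement of the primer.
    site = "".join(COMPLEMENT[b] for b in reversed(primer_5to3))
    start = template_5to3.find(site)
    if start < 0:
        raise ValueError("primer does not anneal to template")
    # New bases are added at the primer's 3' end while the polymerase reads
    # the template toward its 5' end, i.e. the bases upstream of the site.
    upstream = template_5to3[:start]
    extension = "".join(COMPLEMENT[b] for b in reversed(upstream))
    return primer_5to3 + extension

print(extend_primer("GATTACACCGT", "ACGG"))
```

Here the primer ACGG anneals to the 3' end of the template, and the product is the full reverse complement of the template, so the product and the template together form the double-stranded molecule referred to above.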

In one embodiment, the nucleotide sequence of the oligonucleotide tag is 
determined by measuring inorganic pyrophosphate (PPi) liberated from a nucleotide 
triphosphate (dNTP) as the dNMP is incorporated into an extended sequence primer. 
This method of sequencing, termed Pyrosequencing™ technology (PyroSequencing 
AB, Stockholm, Sweden), can be performed in solution (liquid phase) or as a solid 
phase technique. PPi-based sequencing methods are described in, e.g., U.S. Patents 
6,274,320, 6,258,568 and 6,210,891, WO9813523A1, Ronaghi, et al. (1996) Anal. 
Biochem. 242:84-89, Ronaghi, et al. (1998) Science 281:363-365, and USSN 
2001/0024790. These disclosures of PPi sequencing are incorporated herein in their 
entirety, by reference. See also, e.g., US patents 6,210,891 and 6,258,568, each of 
which is fully incorporated herein by this reference. 

Pyrophosphate can be detected by a number of different methodologies, and 
various enzymatic methods have been previously described (see, e.g., Reeves, et al. 
(1969) Anal. Biochem. 28:282-287; Guillory, et al. (1971) Anal. Biochem. 39:170-180; 
Johnson, et al. (1968) Anal. Biochem. 15:273; Cook, et al. (1978) Anal. Biochem. 
91:557-565; and Drake, et al. (1979) Anal. Biochem. 94:117-120). 

In one embodiment, PPi is detected enzymatically (e.g., by the generation of 
light). Such methods enable a nucleotide to be identified in a given target position, 
and the DNA to be sequenced simply and rapidly while avoiding the need for 
electrophoresis and the use of potentially dangerous radiolabels. 

In one embodiment, the PPi and a coupled luciferase-luciferin reaction are used 
to generate light for detection. In another embodiment, the PPi and a coupled 
sulfurylase/luciferase reaction are used to generate light for detection as described in 
U.S. Patent 6,902,921, the contents of which are hereby expressly incorporated herein 
by reference. In one embodiment, the sulfurylase is thermostable. In some 
embodiments, either or both the sulfurylase and luciferase are immobilized on one or 
more mobile solid supports disposed at each reaction site. 

In another embodiment, the nucleotide sequence of the oligonucleotide tag 
may be determined according to the methods described in PCT Publication No. WO 
01/23610, the contents of which are incorporated herein by reference. Briefly, a 
target nucleotide sequence can be determined by generating its complement using the 
polymerase reaction to extend a suitable primer, and characterizing the successive 
incorporation of bases that generate the complement sequence. The target sequence 
is, typically, immobilized on a solid support. Each of the different bases A, T, G, or C 
is then brought, by sequential addition, into contact with the target, and any 
incorporation events are detected via a suitable label attached to the base. 

A labeled base is incorporated into the complementary sequence by the use of 
a polymerase, e.g., a polymerase with a 3' to 5' exonuclease activity (e.g., DNA 
polymerase I, the Klenow fragment, DNA polymerase III, T4 DNA polymerase, and 
T7 DNA polymerase). Following detection of the incorporated labeled base, the 
polymerase replaces the terminally labeled base with a corresponding unlabelled base, 
thus permitting further sequencing to occur. 

In yet another embodiment, the nucleotide sequence of the oligonucleotide tag 
is determined by the use of single molecule sequencing by synthesis methods 
described in, for example, PCT Publication No. WO 2005/080605, the entire contents 
of which are expressly incorporated by reference. The benefit of using this 
technology is that it eliminates the need for DNA amplification prior to sequencing, 
thus abolishing the introduction of amplification errors and bias. Briefly, the 
encoding oligonucleotide is hybridized to a universal primer immobilized on a solid 
surface. The oligonucleotide:primer duplexes are visualized by, e.g., illuminating the 
surface with a laser and imaging with a digital TV camera connected to a microscope, 
and the positions of all the duplexes on the surface are recorded. DNA polymerase 
and one type of fluorescently labeled nucleotide, e.g., A, are added to the surface and 
the nucleotide is incorporated into the appropriate primer. Subsequently, the polymerase and the 
unincorporated nucleotides are washed from the surface and the incorporated 
nucleotide is visualized by, e.g., illuminating the surface with a laser and imaging 
with a camera as before to record the positions of the incorporated nucleotides. The 
fluorescent label is removed from each incorporated nucleotide and the process is 
repeated with the next nucleotide, e.g., G, stepping through A, C, G, T, until the 
desired read-length is achieved. 
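
The cyclic addition scheme described above can be illustrated with a minimal sketch. 
It is not the method of the cited publication: the function name, the flow order 
"ACGT", and the simplifying assumption that at most one nucleotide is incorporated 
per flow on a given template are all assumptions made for illustration. 

```python
# Watson-Crick pairing used to decide whether a flowed nucleotide is incorporated.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_synthesis(template, flow_order="ACGT", read_length=None):
    """Simulate cyclic single-nucleotide addition on one primed template.

    template: template bases downstream of the primer, listed in the order
              in which the polymerase reads them (hypothetical input format).
    Returns (read, record): the synthesized strand and a list of
    (flowed_base, incorporated) tuples, one per flow.
    Assumption: at most one incorporation per flow.
    """
    if read_length is None:
        read_length = len(template)
    position = 0
    read = []
    record = []
    while position < read_length:
        for base in flow_order:
            if position >= read_length:
                break
            # The flowed base is incorporated only if it pairs with the
            # next unread template base.
            incorporated = COMPLEMENT[template[position]] == base
            record.append((base, incorporated))
            if incorporated:
                read.append(base)
                position += 1
    return "".join(read), record


read, record = sequence_by_synthesis("AAT")
print(read, len(record))
```

Each recorded flow corresponds to one wash-and-image step, so the length of the 
record gives a rough sense of how many imaging cycles a given read would require 
under these assumptions. 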

One group of fluorescent dyes suitable for this method of sequencing is 
fluorescence resonance energy transfer (FRET) dyes, including donor and acceptor 
energy fluorescent dyes and linkers such as, for example, Cy3 and Cy5. FRET is a 
phenomenon described in, for example, Selvin (1995) Methods in Enzymol. 246:300. 
FRET can detect the incorporation of multiple nucleotides into a single 
oligonucleotide molecule and is, thus, useful for sequencing the encoding 
oligonucleotides of the invention. Sequencing methods using FRET are described in, 
for example, PCT Publication No. WO 2005/080605, the entire contents of which are 
expressly incorporated by reference. Alternatively, quantum dots can be used as a 
labeling moiety on the different types of nucleotides for use in sequencing reactions. 

Once single ligands are identified by the above-described process, various 
levels of analysis can be applied to yield structure-activity relationship information 
and to guide further optimization of the affinity, specificity and bioactivity of the 
ligand. For ligands derived from the same scaffold, three-dimensional molecular 
modeling can be employed to identify significant structural features common to the 
ligands, thereby generating families of small-molecule ligands that presumably bind 
at a common site on the target biomolecule. 

A variety of screening approaches can be used to obtain ligands that possess 
high affinity for one target but significantly weaker affinity for another closely related 
target. One screening strategy is to identify ligands for both biomolecules in parallel 
experiments and to subsequently eliminate common ligands by a cross-referencing 
comparison. In this method, ligands for each biomolecule can be separately identified 
as disclosed above. This method is compatible with both immobilized target 
biomolecules and target biomolecules free in solution. 
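
The cross-referencing comparison amounts to a set difference over the ligands 
identified in the two parallel experiments. The sketch below assumes, purely for 
illustration, that each ligand is represented by the tuple of tag numbers decoded 
from its oligonucleotide; the function name and the example identifiers are 
hypothetical. 

```python
def selective_ligands(target_hits, related_hits):
    """Return ligands found for the target but not for the related biomolecule.

    Both arguments are iterables of hashable ligand identifiers, e.g. tuples
    of decoded tag numbers (a hypothetical representation).
    """
    return set(target_hits) - set(related_hits)


# Hypothetical decoded hits from two parallel selections.
target_hits = {("1.1", "2.3"), ("1.4", "2.7"), ("1.9", "2.2")}
related_hits = {("1.4", "2.7"), ("1.6", "2.1")}

print(sorted(selective_ligands(target_hits, related_hits)))
```

Ligands common to both hit lists are dropped, leaving only those that bound the 
target of interest. 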

For immobilized target biomolecules, another strategy is to add a preselection 
step that eliminates all ligands that bind to the non-target biomolecule from the 
library. For example, a first biomolecule can be contacted with an encoded library as 
described above. Compounds which do not bind to the first biomolecule are then 
separated from any first biomolecule-ligand complexes which form. The second 
biomolecule is then contacted with the compounds which did not bind to the first 
biomolecule. Compounds which bind to the second biomolecule can be identified as 
described above and have significantly greater affinity for the second biomolecule 
than for the first biomolecule. 

A ligand for a biomolecule of unknown function which is identified by the 
method disclosed above can also be used to determine the biological function of the 
biomolecule. This is advantageous because although new gene sequences continue to 
be identified, the functions of the proteins encoded by these sequences and the 
validity of these proteins as targets for new drug discovery and development are 
difficult to determine and represent perhaps the most significant obstacle to applying 
genomic information to the treatment of disease. Target-specific ligands obtained 
through the process described in this invention can be effectively employed in whole 
cell biological assays or in appropriate animal models to understand both the function 
of the target protein and the validity of the target protein for therapeutic intervention. 
This approach can also confirm that the target is specifically amenable to small 
molecule drug discovery. 

In one embodiment, one or more compounds within a library of the invention 
are identified as ligands for a particular biomolecule. These compounds can then be 
assessed in an in vitro assay for the ability to bind to the biomolecule. Preferably, the 
functional moieties of the binding compounds are synthesized without the 
oligonucleotide tag or linker moiety, and these functional moieties are assessed for the 
ability to bind to the biomolecule. 

The effect of the binding of the functional moieties to the biomolecule on the 
function of the biomolecule can also be assessed using in vitro cell-free or cell-based 
assays. For a biomolecule having a known function, the assay can include a 
comparison of the activity of the biomolecule in the presence and absence of the 
ligand, for example, by direct measurement of the activity, such as enzymatic activity, 
or by an indirect measure, such as a cellular function that is influenced by the 
biomolecule. If the biomolecule is of unknown function, a cell which expresses the 
biomolecule can be contacted with the ligand and the effect of the ligand on the 
viability, function, phenotype, and/or gene expression of the cell is assessed. The in 
vitro assay can be, for example, a cell death assay, a cell proliferation assay or a viral 
replication assay. For example, if the biomolecule is a protein expressed by a virus, a 
cell infected with the virus can be contacted with a ligand for the protein. The effect 
of the binding of the ligand to the protein on viral viability can then be assessed. 

A ligand identified by the method of the invention can also be assessed in an 
in vivo model or in a human. For example, the ligand can be evaluated in an animal or 
organism which produces the biomolecule. Any resulting change in the health status 
(e.g., disease progression) of the animal or organism can be determined. 

For a biomolecule, such as a protein or a nucleic acid molecule, of unknown 
function, the effect of a ligand which binds to the biomolecule on a cell or organism 
which produces the biomolecule can provide information regarding the biological 
function of the biomolecule. For example, the observation that a particular cellular 
process is inhibited in the presence of the ligand indicates that the process depends, at 
least in part, on the function of the biomolecule. 

Ligands identified using the methods of the invention can also be used as 
affinity reagents for the biomolecule to which they bind. In one embodiment, such 
ligands are used to effect affinity purification of the biomolecule, for example, via 
chromatography of a solution comprising the biomolecule using a solid phase to 
which one or more such ligands are attached. 

In addition to the screening of encoded libraries as described herein, other 
traditional drug discovery methods, such as phage display, differential display 
(mRNA display), and aptamer/SELEX, could benefit from the methods of the 
invention which eliminate the introduction of amplification errors and biases. For 
example, multiple rounds of selection using phage display (described in, for example, 
PCT Publication Nos. WO 91/18980, WO 91/19818, and WO 92/18619, and U.S. Patent 
No. 5223409, the entire contents of each of which are incorporated herein by 
reference) can cause host toxicity and, consequently, loss or under-representation of 
desired library members (see, e.g., Daugherty, P.S., et al. (1999) Protein Engineering 
12(7):613-621 and Holt, L.J., et al. (2000) Nucleic Acids Res. 28(15):E72). 

Moreover, methods such as Systematic Evolution of Ligands by Exponential 
enrichment (also known as SELEX, which is described in, for example, U.S. Patents 
5654151, 5503978, 5567588 and 5270163, as well as PCT Publication Nos. WO 
96/38579 and WO9927133A1, the entire contents of each of which are incorporated 
herein by reference) introduce biases due to the need for multiple rounds of selection, 
i.e., partitioning unbound nucleic acids from those nucleic acids which have bound 
specifically to a target molecule, and multiple rounds of amplification of the nucleic 
acids that have bound to the target by reverse transcription and PCR. Similarly, 
methods of selection like differential display (described in, for example, U.S. Patents 
5580726 and 5700644, the entire contents of each of which are incorporated herein by 
reference) rely on multiple rounds of PCR amplification, which also leads to unequal 
representation of the clones in the library. Thus, the foregoing multi-step selection 
processes may benefit from the methods described herein which employ massively 
parallel sequencing approaches (such as, for example, a pyrophosphate-based 
sequencing method or a single molecule sequencing by synthesis method) which lead 
to the accurate identification of a compound with a desired biological activity without 
the need for any nucleic acid amplification. 

This invention is further illustrated by the following examples which should 
not be construed as limiting. The contents of all references, patents and published 
patent applications cited throughout this application, as well as the Figures and the 
Sequence Listing, are hereby incorporated by reference. 



Example 1: Synthesis and Characterization of a library on the order of 10^5 members 

The synthesis of a library comprising on the order of 10^5 distinct members was 
accomplished using the following reagents: 

Compound 1: 

[Structure of Compound 1 (1) not reproduced; the legible portion shows an H2N 
group and the two oligonucleotide strands below.] 

O-TGACTCCCAAATCAATGTG-3' 
O-ACTGAGGGTTTAGTTAC-PO4-5' 

Single letter codes for deoxyribonucleotides: 
A = adenosine 
C = cytidine 
G = guanosine 
T = thymidine 

Building block precursors: 

[Chemical structures of building block precursors BB1 to BB12 not reproduced; the 
legible fragments show Fmoc, NH2 and OH groups.] 

Oligonucleotide tags: 

Tag number   Sequence (top strand)                  Sequence (bottom strand) 
1.1          5'-PO4-GCAACGAAG (SEQ ID NO: 1)        ACCGTTGCT-PO3-5' (SEQ ID NO: 2) 
1.2          5'-PO3-GCGTACAAG (SEQ ID NO: 3)        ACCGCATGT-PO3-5' (SEQ ID NO: 4) 
1.3          5'-PO3-GCTCTGTAG (SEQ ID NO: 5)        ACCGAGACA-PO3-5' (SEQ ID NO: 6) 
1.4          5'-PO3-GTGCCATAG (SEQ ID NO: 7)        ACCACGGTA-PO3-5' (SEQ ID NO: 8) 
1.5          5'-PO3-GTTGACCAG (SEQ ID NO: 9)        ACCAACTGG-PO3-5' (SEQ ID NO: 10) 
1.6          5'-PO3-CGACTTGAC (SEQ ID NO: 11)       CAAGTCGCA-PO3-5' (SEQ ID NO: 12) 
1.7          5'-PO3-CGTAGTCAG (SEQ ID NO: 13)       ACGCATCAG-PO3-5' (SEQ ID NO: 14) 
1.8          5'-PO3-CCAGCATAG (SEQ ID NO: 15)       ACGGTCGTA-PO3-5' (SEQ ID NO: 16) 
1.9          5'-PO3-CCTACAGAG (SEQ ID NO: 17)       ACGGATGTC-PO3-5' (SEQ ID NO: 18) 
1.10         5'-PO3-CTGAACGAG (SEQ ID NO: 19)       CGTTCAGCA-PO3-5' (SEQ ID NO: 20) 
1.11         5'-PO3-CTCCAGTAG (SEQ ID NO: 21)       ACGAGGTCA-PO3-5' (SEQ ID NO: 22) 
1.12         5'-PO3-TAGGTCCAG (SEQ ID NO: 23)       ACATCCAGG-PO3-5' (SEQ ID NO: 24) 
2.1          5'-PO3-GCGTGTTGT (SEQ ID NO: 25)       TCCGCACAA-PO3-5' (SEQ ID NO: 26) 
2.2          5'-PO3-GCTTGGAGT (SEQ ID NO: 27)       TCCGAACCT-PO3-5' (SEQ ID NO: 28) 
2.3          5'-PO3-GTCAAGCGT (SEQ ID NO: 29)       TCCAGTTCG-PO3-5' (SEQ ID NO: 30) 
2.4          5'-PO3-CAAGAGCGT (SEQ ID NO: 31)       TCGTTCTCG-PO3-5' (SEQ ID NO: 32) 
2.5          5'-PO3-CAGTTCGGT (SEQ ID NO: 33)       TCGTCAAGC-PO3-5' (SEQ ID NO: 34) 
2.6          5'-PO3-CGAAGGAGT (SEQ ID NO: 35)       TCGCTTCCT-PO3-5' (SEQ ID NO: 36) 
2.7          5'-PO3-CGGTGTTGT (SEQ ID NO: 37)       TCGCCACAA-PO3-5' (SEQ ID NO: 38) 
2.8          5'-PO3-CGTTGCTGT (SEQ ID NO: 39)       TCGCAACGA-PO3-5' (SEQ ID NO: 40) 
2.9          5'-PO3-CCGATCTGT (SEQ ID NO: 41)       TCGGCTAGA-PO3-5' (SEQ ID NO: 42) 
2.10         5'-PO3-CCTTCTCGT (SEQ ID NO: 43)       TCGGAAGAG-PO3-5' (SEQ ID NO: 44) 
2.11         5'-PO3-TGAGTCCGT (SEQ ID NO: 45)       TCACTCAGG-PO3-5' (SEQ ID NO: 46) 
2.12         5'-PO3-TGCTACGGT (SEQ ID NO: 47)       TCAGATTGC-PO3-5' (SEQ ID NO: 48) 
3.1          5'-PO3-GTGCGTTGA (SEQ ID NO: 49)       CACACGCAA-PO3-5' (SEQ ID NO: 50) 
3.2          5'-PO3-GTTGGCAGA (SEQ ID NO: 51)       CACAACCGT-PO3-5' (SEQ ID NO: 52) 
3.3          5'-PO3-CCTGTAGGA (SEQ ID NO: 53)       CAGGACATC-PO3-5' (SEQ ID NO: 54) 
3.4          5'-PO3-CTGCGTAGA (SEQ ID NO: 55)       CAGACGCAT-PO3-5' (SEQ ID NO: 56) 
3.5          5'-PO3-CTTACGCGA (SEQ ID NO: 57)       CAGAATGCG-PO3-5' (SEQ ID NO: 58) 
3.6          5'-PO3-TGGTCACGA (SEQ ID NO: 59)       CAACCAGTG-PO3-5' (SEQ ID NO: 60) 
3.7          5'-PO3-TCAGAGCGA (SEQ ID NO: 61)       CAAGTCTCG-PO3-5' (SEQ ID NO: 62) 
3.8          5'-PO3-TTGCTCGGA (SEQ ID NO: 63)       CAAACGAGC-PO3-5' (SEQ ID NO: 64) 
3.9          5'-PO3-GCAGTTGGA (SEQ ID NO: 65)       CACGTCAAC-PO3-5' (SEQ ID NO: 66) 
3.10         5'-PO3-GCGTGAAGA (SEQ ID NO: 67)       CACGGACTT-PO3-5' (SEQ ID NO: 68) 
3.11         5'-PO3-GTAGCCAGA (SEQ ID NO: 69)       CACATCGGT-PO3-5' (SEQ ID NO: 70) 
3.12         5'-PO3-GTCGCTTGA (SEQ ID NO: 71)       CACAGCGAA-PO3-5' (SEQ ID NO: 72) 
4.1          5'-PO3-GCCTAAGTT (SEQ ID NO: 73)       CTCGGATTC-PO3-5' (SEQ ID NO: 74) 
4.2          5'-PO3-GTAGTGCTT (SEQ ID NO: 75)       CTCATCACG-PO3-5' (SEQ ID NO: 76) 
4.3          5'-PO3-GTCGAAGTT (SEQ ID NO: 77)       CTCAGCTTC-PO3-5' (SEQ ID NO: 78) 
4.4          5'-PO3-GTTTCGGTT (SEQ ID NO: 79)       CTCAAAGCC-PO3-5' (SEQ ID NO: 80) 
4.5          5'-PO3-CAGCGTTTT (SEQ ID NO: 81)       CTGTCGCAA-PO3-5' (SEQ ID NO: 82) 
4.6          5'-PO3-CATACGCTT (SEQ ID NO: 83)       CTGTATGCG-PO3-5' (SEQ ID NO: 84) 
4.7          5'-PO3-CGATCTGTT (SEQ ID NO: 85)       CTGCTAGAC-PO3-5' (SEQ ID NO: 86) 
4.8          5'-PO3-CGCTTTGTT (SEQ ID NO: 87)       CTGCGAAAC-PO3-5' (SEQ ID NO: 88) 
4.9          5'-PO3-CCACAGTTT (SEQ ID NO: 89)       CTGGTGTCA-PO3-5' (SEQ ID NO: 90) 
4.10         5'-PO3-CCTGAAGTT (SEQ ID NO: 91)       CTGGACTTC-PO3-5' (SEQ ID NO: 92) 
4.11         5'-PO3-CTGACGATT (SEQ ID NO: 93)       CTGACTGCT-PO3-5' (SEQ ID NO: 94) 
4.12         5'-PO3-CTCCACTTT (SEQ ID NO: 95)       CTGAGGTGA-PO3-5' (SEQ ID NO: 96) 
5.1          5'-PO3-ACCAGAGCC (SEQ ID NO: 97)       AATGGTCTC-PO3-5' (SEQ ID NO: 98) 
5.2          5'-PO3-ATCCGCACC (SEQ ID NO: 99)       AATAGGCGT-PO3-5' (SEQ ID NO: 100) 
5.3          5'-PO3-GACGACACC (SEQ ID NO: 101)      AACTGCTGT-PO3-5' (SEQ ID NO: 102) 
5.4          5'-PO3-GGATGGACC (SEQ ID NO: 103)      AACCTACCT-PO3-5' (SEQ ID NO: 104) 
5.5          5'-PO3-GCAGAAGCC (SEQ ID NO: 105)      AACGTCTTC-PO3-5' (SEQ ID NO: 106) 
5.6          5'-PO3-GCCATGTCC (SEQ ID NO: 107)      AACGGTACA-PO3-5' (SEQ ID NO: 108) 
5.7          5'-PO3-GTCTGCTCC (SEQ ID NO: 109)      AACAGACGA-PO3-5' (SEQ ID NO: 110) 
5.8          5'-PO3-CGACAGACC (SEQ ID NO: 111)      AAGCTGTCT-PO3-5' (SEQ ID NO: 112) 
5.9          5'-PO3-CGCTACTCC (SEQ ID NO: 113)      AAGCGATGA-PO3-5' (SEQ ID NO: 114) 
5.10         5'-PO3-CCACAGACC (SEQ ID NO: 115)      AAGGTGTCT-PO3-5' (SEQ ID NO: 116) 
5.11         5'-PO3-CCTCTCTCC (SEQ ID NO: 117)      AAGGAGAGA-PO3-5' (SEQ ID NO: 118) 
5.12         5'-PO3-CTCGTAGCC (SEQ ID NO: 119)      AAGAGCATC-PO3-5' (SEQ ID NO: 120) 
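
The paired strands of a tag can be checked for complementarity with a short sketch. 
It assumes, as the listed sequences for Tags 1.1, 1.2 and 2.1 suggest, that the top 
strand is written 5' to 3', the bottom strand is written 3' to 5', and the two are 
offset by two bases so that each strand carries a two-base 3' overhang. The function 
names and the offset parameter are illustrative assumptions, not part of the 
disclosure. 

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def strands_pair(top_5to3, bottom_3to5, offset=2):
    """Check that top[i] pairs with bottom[i + offset] across the duplex region.

    offset is the assumed length of the 3' overhang on each strand.
    """
    overlap = len(top_5to3) - offset
    return all(
        COMPLEMENT[top_5to3[i]] == bottom_3to5[i + offset] for i in range(overlap)
    )


def overhangs_compatible(prev_top_5to3, next_bottom_3to5, offset=2):
    """Check that the 3' overhang of one tag's top strand pairs with the
    3' overhang of the next tag's bottom strand (written 3' to 5')."""
    top_overhang = prev_top_5to3[-offset:]
    bottom_overhang = next_bottom_3to5[:offset]
    return all(COMPLEMENT[t] == b for t, b in zip(top_overhang, bottom_overhang))


tag_1_1 = ("GCAACGAAG", "ACCGTTGCT")  # SEQ ID NO: 1 and 2
tag_2_1 = ("GCGTGTTGT", "TCCGCACAA")  # SEQ ID NO: 25 and 26

print(strands_pair(*tag_1_1), strands_pair(*tag_2_1))
print(overhangs_compatible(tag_1_1[0], tag_2_1[1]))
```

Under these assumptions the overhang of a Cycle 1 tag pairs with that of a Cycle 2 
tag, which is consistent with the tags being ligated in cycle order. 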

1X ligase buffer: 50 mM Tris, pH 7.5; 10 mM dithiothreitol; 10 mM MgCl2; 2.5 mM 
ATP; 50 mM NaCl. 

10X ligase buffer: 500 mM Tris, pH 7.5; 100 mM dithiothreitol; 100 mM MgCl2; 25 
mM ATP; 500 mM NaCl. 



Cycle 1 

To each of twelve PCR tubes was added 50 µL of a 1 mM solution of 
Compound 1 in water; 75 µL of a 0.80 mM solution of one of Tags 1.1-1.12; 15 µL 
10X ligase buffer and 10 µL deionized water. The tubes were heated to 95 °C for 1 
minute and then cooled to 16 °C over 10 minutes. To each tube was added 5,000 
units T4 DNA ligase (2.5 µL of a 2,000,000 unit/mL solution (New England Biolabs, 
Cat. No. M0202)) in 50 µL 1X ligase buffer and the resulting solutions were incubated 
at 16 °C for 16 hours. 
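
The quantities in the ligation step can be cross-checked with simple arithmetic. 
The sketch below only restates the volumes and concentrations given above; the 
variable and function names are illustrative. 

```python
def nanomoles(volume_ul, conc_mM):
    """Amount in nmol: microliters times mmol/L gives nanomoles."""
    return volume_ul * conc_mM


compound_nmol = nanomoles(50, 1.0)    # Compound 1: 50 µL at 1 mM
tag_nmol = nanomoles(75, 0.80)        # tag: 75 µL at 0.80 mM
tag_equivalents = tag_nmol / compound_nmol

# Volume before ligase addition: Compound 1 + tag + 10X buffer + water.
annealing_volume_ul = 50 + 75 + 15 + 10
# Buffer strength after diluting 15 µL of 10X buffer into that volume.
buffer_strength_x = 10 * 15 / annealing_volume_ul

# Volume of ligase stock needed for 5,000 units at 2,000,000 units/mL.
ligase_volume_ul = 5000 * 1000 / 2_000_000

print(round(tag_equivalents, 2), annealing_volume_ul,
      buffer_strength_x, ligase_volume_ul)
```

On these figures the tag is in roughly 1.2-fold molar excess over Compound 1, the 
10X buffer is diluted to 1X in the 150 µL annealing mixture, and the stated 2.5 µL 
of ligase stock corresponds to 5,000 units. 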

Following ligation, samples were transferred to 1.5 ml Eppendorf tubes and 
treated with 20 µL 5 M aqueous NaCl and 500 µL cold (-20 °C) ethanol, and held at 
-20 °C for 1 hour. Following centrifugation, the supernatant was removed and the 
pellet was washed with 70% aqueous ethanol at -20 °C. Each of the pellets was then 
dissolved in 150 µL of 150 mM sodium borate buffer, pH 9.4. 

Stock solutions comprising one each of building block precursors BB1 to 
BB12, N,N-diisopropylethanolamine and O-(7-azabenzotriazol-1-yl)-1,1,3,3- 
tetramethyluronium hexafluorophosphate, each at a concentration of 0.25 M, were 
prepared in DMF and stirred at room temperature for 20 minutes. The building 
block precursor solutions were added to each of the pellet solutions described above 
to provide a 10-fold excess of building block precursor relative to linker. The 
resulting solutions were stirred. An additional 10 equivalents of building block 
precursor was added to the reaction mixture after 20 minutes, and another 10 
equivalents after 40 minutes. The final concentration of DMF in the reaction mixture 
was 22%. The reaction solutions were then stirred overnight at 4 °C. The reaction 
progress was monitored by RP-HPLC using 50 mM aqueous tetraethylammonium 
acetate (pH 7.5) and acetonitrile, and a gradient of 2-46% acetonitrile over 14 min. 
The reaction was stopped when ~95% of the starting material (linker) was acylated. Following 
acylation the reaction mixtures were pooled and lyophilized to dryness. The 
lyophilized material was then purified by HPLC, and the fractions corresponding to 
the library (acylated product) were pooled and lyophilized. 

The library was dissolved in 2.5 ml of 0.01 M sodium phosphate buffer (pH 
8.2) and 0.1 ml of piperidine (4% v/v) was added to it. The addition of piperidine 
results in turbidity which does not dissolve on mixing. The reaction mixtures were 
stirred at room temperature for 50 minutes, and then the turbid solution was 
centrifuged (14,000 rpm), the supernatant was removed using a 200 µL pipette, and the 
pellet was resuspended in 0.1 ml of water. The aqueous wash was combined with the 
supernatant and the pellet was discarded. The deprotected library was precipitated 
from solution by addition of excess ice-cold ethanol so as to bring the final 
concentration of ethanol in the reaction to 70% v/v. Centrifugation of the aqueous 
ethanol mixture gave a white pellet comprising the library. The pellet was washed 
once with cold 70% aq. ethanol. After removal of solvent the pellet was dried in air 
(~5 min.) to remove traces of ethanol and then used in cycle 2. The tags and 
corresponding building block precursors used in Round 1 are set forth in Table 1, 
below. 



Table 1 

Building Block Precursor    Tag 
BB1                         1.11 
BB2                         1.6 
BB3                         1.2 
BB4                         1.8 
BB5                         1.1 
BB6                         1.10 
BB7                         1.12 
BB8                         1.5 
BB9                         1.4 
BB10                        1.3 
BB11                        1.7 
BB12                        1.9 



Cycles 2-5 

For each of these cycles, the combined solution resulting from the previous 
cycle was divided into 12 equal aliquots of 50 µL each and placed in PCR tubes. To 
each tube was added a solution comprising a different tag, and ligation, purification 
and acylation were performed as described for Cycle 1, except that for Cycles 3-5, the 
HPLC purification step described for Cycle 1 was omitted. The correspondence 
between tags and building block precursors for Cycles 2-5 is presented in Table 2. 

The products of Cycle 5 were ligated with the closing primer shown below, 
using the method described above for ligation of tags. 

5'-PO3-GGCACATTGATTTGGGAGTCA 
         GTGTAACTAAACCCTCAGT-PO3-5' 

Table 2 

Building Block    Cycle 2    Cycle 3    Cycle 4    Cycle 5 
Precursor         Tag        Tag        Tag        Tag 
BB1               2.7        3.7        4.7        5.7 
BB2               2.8        3.8        4.8        5.8 
BB3               2.2        3.2        4.2        5.2 
BB4               2.10       3.10       4.10       5.10 
BB5               2.1        3.1        4.1        5.1 
BB6               2.12       3.12       4.12       5.12 
BB7               2.5        3.5        4.5        5.5 
BB8               2.6        3.6        4.6        5.6 
BB9               2.4        3.4        4.4        5.4 
BB10              2.3        3.3        4.3        5.3 
BB11              2.9        3.9        4.9        5.9 
BB12              2.11       3.11       4.11       5.11 



Results: 

The synthetic procedure described above has the capability of producing a 
library comprising 12^5 (about 249,000) different structures. The synthesis of the 
library was monitored via gel electrophoresis of the product of each cycle. The 
results of each of the five cycles and the final library following ligation of the closing 
primer are illustrated in Figure 7. The compound labeled "head piece" is Compound 
1. The figure shows that each cycle results in the expected molecular weight increase 
and that the products of each cycle are substantially homogeneous with regard to 
molecular weight. 
Example 2: Synthesis and Characterization of a library on the order of 10^ members 

The synthesis of a library comprising on the order of 10^ distinct members was 
accomplished using the following reagents: 

Compound 2: 

Single letter codes for deoxyribonucleotides: 
A = adenosine 
C = cytidine 
G = guanosine 
T = thymidine 

Building block precursors: 

[Chemical structures of the building block precursors not reproduced; the labels 
legible in the source range from BB23 to BB86, and the legible fragments show Fmoc, 
NH2 and OH groups.] 

Table 3: Oligonucleotide tags used in cycle 1: 



Tag 

Number Top Strand Sequence 
5'-P03- 

AAATCGATGTGGTCACTCAG 

1.1 (SEQ ID NO: 121) 
5'-P03- 

AAATCGATGTGGACTAGGAG 

1.2 (SEQ ID NO: 123) 
5'-P03- 

AAATCGATGTGCCGTATGAG 

1.3 (SEQ ID NO: 125) 
5'-P03- 

AAATCGATGTGCTGAAGGAG 

1.4 (SEQ ID NO: 127) 
5'-P03- 

AAATCGATGTGGACTAGCAG 

1.5 (SEQ ID NO: 129) 
5'-P03- 

AAATCGATGTGCGCTAAGAG 

1.6 (SEQ ID NO: 131) 



Bottom Strand Sequence 
5'-P03- 

GAGTGACCACATCGATTTGG 
(SEQ ID NO: 122) 
5'-P03- 

CCTAGTCCACATCGATTTGG 
(SEQ ID NO: 124) 
5'-P03- 

CATACGGCACATCGATTTGG 
(SEQ ID NO: 126) 
5'-P03- 

CCTTCAGCACATCGATTTGG 
(SEQ ID NO: 128) 
5'-P03- 

GCTAGTCCACATCGATTTGG 
(SEQ ID NO:130) 
5'-P03- 

CTTAGCGCACATCGATTTGG 
(SEQ ID NO: 132) 
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5'-P03- 


5'-P03- 




AAATCGATGTGAGCCGAGAG 


CTCGGCTCACATCGATTTGG 


1.7 


(SEQ ID NO: 133) 


(SEQ ID NO: 134) 




5'-P03- 


5'-P03- 




AAATCGATGTGCCGTATCAG 


GATACGGCACATCGATTTGG 


1.8 


(SEQ ID NO:135) 


(SEQ ID NO: 136) 




5'-P03- 


5'-P03- 




AAATCGATGTGCTGAAGCAG 


GCTTCAGCACATCGATTTGG 


1.9 


(SEQ ID NO: 137) 


(SEQ ID NO: 138) 




5'-P03- 


5'-P03- 




AAATCGATGTGTGCGAGTAG 


ACTCGCACACATCGATTTGG 


1.10 


(SEQ ID NO: 139) 


(SEQ ID NO: 140) 




5'-P03- 


5'-P03- 




AAATCGATGTGTTTGGCGAG 


CGCCAAACACATCGATTTGG 


1.11 


(SEQ ID NO:141) 


(SEQ ID NO: 142) 




5'-P03- 


5'-P03- 




AAATCGATOTGCGCTAAQAG 


GTTAGCGCACATCGATTTGG 


1.12 


(SEQ ID NO: 143) 


(SEQ ID NO: 144) 




5'-P03- 


5'-P03- 




AAATCGATGTGAGCCGACAG 


GTCGGCrCACATCGATTTGG 


1.13 


(SEQ ID NO: 145) 


(SEQ ID NO:146> 




5'-P03- 


5'-P03- 




AAATCGATGTGAGCCGAAAG 


TTCGGCTCACATCGATTTGG 


1.14 


(SEQ ID NO: 147) 


(SEQ ID NO: 148) 




5'-P03- 


5'-P03- 




AAATCGATGTGTCGGTAGAG 


CTACCGACACATCGATTTGG 


1.15 


(SEQ ID NO: 149) 


(SEQ ID NO: 150) 




5'-P03- 


5'-P03- 




AAATCGATGTGGTTGCCGAG 


CGGCAACCACATCGATTTGG 


1.16 


(SEQ ID NO: 151)' 


(SEQ ID NO: 152) 




5'-P03- 


5'-P03- 




AAATCGATGTGAGTGCGTAG 


ACGCACTCACATCGATTTGG 


1.17 


(SEQ ID NO: 153) 


(SEQ ID NO:154) 




5'-P03- 


5'-P03- 




AAATCGATGTGGTTGCCAAG 


TGGCAACCACATCGATTTGG 


1.18 


(SEQ ID NO: 155) 


(SEQ ID NO: 156) 




5'-P03- 


5'-P03- 




AAATCGATGTGTGCGAGGAG 


CCTCGCACACATCGATTTGG 


1.19 


(SEQ ID NO:157) 


(SEQ ID NO:158) 




5'-P03- 


5'-P03- 




AAATCGATGTGGAACACGAG 


CGTGTTCCACATCGATTTGG 


1.20 


(SEQ ID NO:159) 


(SEQ ID NO: 160) 




5'-P03- 


5'-P03- 




AAATCGATGTGCTTGTCGAG 


CGACAAGCACATCGATTTGG 


1.21 


(SEQ ID-NO:161> 


(SEQ ID NO: 162) 




5'-P03- 


5'-P03- 




AAATCGATGTGTTCCGGTAG 


AOCCGGAACACATCGATTTGG 


1.22 


(SEQ ID NO: 153) 


(SEQ ID NO: 164) 




5'-P03- 


5'-P03- 




AAATCGATGTGTGCGAGCAG 


GCTCGCACACATCGATTTGG 


1.23 


(SEQ ID NO: 165) 


(SEQ ID NO: 166) 




5'-P03- 


5'-P03- 


1.24 


AAATCGATGTGGTCAGGTAG 


ACCTGACCACATCGATTTGG 
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5 -rU3- 


5'-P03- 




AAA i CVjA lijr 1 vjvjL/CTG 11 AG 


AACAGGCCACATCGATTTGG 






(SEQ ID NO: 170) 






5'-P03- 




AAA T*/^/^ A Tni^T*/^/^ A A A ^^^^ it 

AAA 1 CoA 1 G 1 vjGAAC ACC AG 


GGTGTTCCACATCGATTTGG 




(SEQ ID NO:i/l) 


(SEQ ID NO: 172) 






5'-P03- 




D -lr(J3-AAAlCOATGTGCTTGTCCAG 


GGACAAGCACATCGATTTGG 


1.27 


(SEQ ID NO:173) 


(SEQ ID NO: 17,4) 




5'-P03- 


5'-P03- 




AAATCGATGTGTGCGAGAAG 


TCTCGCACACATCGATTTGG 


1.28 


(SEQ ID NO: 175) 


(SEQ ID NO:176) 




5'-P03- 


5'-P03- 




AAATCGATGTGAGTGCGGAG 


CCGCACTCACATCGATTTGG 


1.29 


(SEQ ID NO: 177) 


(SEQ ID NO:178) 




5'-P03- 


5'-P03- 




AAA 1 CGATGTG rTGTCCGAG 


CGGACAACACATCGATTTGG 


1.30 


(SEQ ID NO: 179) 


(SEQ ID NO: 180) 




5'-P03- 


5'-P03- 




AAA 1 CGATGTGTGGAACGAG 


CGTTCCACACATCGATTTGG 


1 o 1 
1.31 


(SEQ ID NO:.181) 


(SEQ ID NO: 182) 




5'-P03- 


5'-P03- 




AAATCGATGTGAGTGCGAAG 


TCGCACTCACATCGATTTGG 


1.32 


(SEQ ID NO:183) 


(SEQ ID NO: 184) 




5'-P03- 


5'-P03- 




AAA 1 CGA rGTGTGGAACCAG 


GGTTCCACACATCGATTTGG 


1.33 


(SEQ ID NO: 185) 


(SEQ ID NO: 186) 




5'-P03- 


5'-P03- 




AAATCGATGTGTTAGGCGAG 


CGCCTAACACATCGATTTGG 


1 .34 


(SEQ ID NO: 187) 


(SEQ ID NO:188) 




5 -P03- 


5'-P03- 




AAA 1 CGA 1 G 1 GGCCTGTGAG 


CACAGGCCACATCGATTTGG 




(bEQ ID NO:loy) 


(SEQ ID NO: 190) 






5'"P03- 




D -Jr (J J -AAA i CGA 1 G 1 GC iCCTG T AG 


ACAGGAGCACATCGATTTGG 


1 1/^ 


voUjy ID iMu:iy-i) 


(SEQ ID NO:192) 




5'-P03"- 


5'-P03- 




AAATCGATGTGGTCAGGCAG 


GCCTGACCACATCGATTTGG 


1.37 


(SEQ ID NO: 193) 


(SEQ ID NO: 194) 




5'--P03- 


5'-P03- 




AAA TP/^/^ A 'T'/~<nP/^/^'T'/^ A /^/^ A A >^ 

AAAl CGATGTGGTCAGGAAG 


TCCTGACCACATCGATTTGG 


1.38 


(SEQ ID NO:195) 


(SEQ ID NO:196) 




5'-P03- 


5'-P03- 




AAATCGATGTGGTAGCCGAG 


CGGCTACCACATCGATTTGG 


1.39


(SEQ ID NO:197)

(SEQ ID NO:198) 




5'-P03- 


5'-P03- 




AAATCGATGTGGCCTGTAAG 


TACAGGCCACATCGATTTGG 


1.40 


(SEQ ID NO: 199) 


(SEQ ID NO:200) 




5'-P03- 


5'-P03- 




AAATCGATGTGCTTTCGGAG 


CCGAAAGCACATCGATTTGG 


1.41 


(SEQ ID NO:201) 


(SEQ ID NO:202) 






WO 2007/053358



PCT/US2006/041356 



5'-P03- 

AAATCGATGTGCGTAAGGAG 

1.42 (SEQ ID NO: 203) 
5'-P03- 

AAATCGATGTGAGAGCGTAG 

1.43 (SEQ ID NO: 205) 
5'-P03- 

AAATCGATGTGGACGGCAAG

1.44 (SEQ ID NO: 207) 

5'-P03-AAATCGATGTGCTTTCGCAG

1.45 (SEQ ID NO: 209) 
5'-P03- 

AAATCGATGTGCGTAAGCAG 

1.46 (SEQ ID NO: 211) 
5'-P03- 

AAATCGATGTGGCTATGGAG 

1.47 (SEQ ID NO: 213) 
5'-P03- 

AAATCGATGTGACTCTGGAG 

1.48 (SEQ ID NO:215)



5'-P03- 

CCTTACGCACATCGATTTGG
(SEQ ID NO: 204) 
5'-P03- 

ACGCTCTCACATCGATTTGG
(SEQ ID NO:206) 
5'-P03- 

TGCCGTCCACATCGATTTGG 
(SEQ ID NO:208) 
5'-P03- 

GCGAAAGCACATCGATTTGG 
(SEQ ID NO:210)
5'-P03-

GCTTACGCACATCGATTTGG
(SEQ ID NO:212)
5'-P03- 

CCATAGCCACATCGATTTGG 
(SEQ ID NO:214) 
5'-P03- 

CCAGAGTCACATCGATTTGG 
(SEQ ID NO: 216) 



5'-P03-AAATCGATGTGCTGGAAAG 

1.49 (SEQ ID NO: 217) 
5'-P03- 

AAATCGATGTGCCGAAGTAG 

1.50 (SEQ ID NO: 219) 
5'-P03- 

AAATCGATGTGCTCCTGAAG 

1.51 (SEQ ID NO: 221) 
5'-P03-

AAATCGATGTGTCCAGTCAG

1.52 (SEQ ID NO: 223) 
5'-P03- 

AAATCGATGTGAGAGCGGAG 

1.53 (SEQ ID NO: 225) 
5'-P03- 

AAATCGATGTGAGAGCGAAG 

1.54 (SEQ ID NO: 227) 
5'-P03- 

AAATCGATGTGCCGAAGGAG 

1.55 (SEQ ID NO: 229) 
5'-P03- 

AAATCGATGTGCCGAAGCAG 

1.56 (SEQ ID NO: 231) 
5'-P03- 

AAATCGATGTGTGTTCCGAG 

1.57 (SEQ ID NO: 233) 
5'-P03- 

AAATCGATGTGTCTGGCGAG 

1.58 (SEQ ID NO: 235) 
5'-P03- 

1.59 AAATCGATGTGCTATCGGAG 



5'-P03- 

TTCCAGCACATCGATTTGG 
(SEQ ID NO:218) 
5'-P03- 

ACTTCGGCACATCGATTTGG 
(SEQ ID NO:220) 
5'-P03- 

TCAGGAGCACATCGATTTGG 
(SEQ ID NO:222) 
5'-P03- 

GACTGGACACATCGATTTGG 
(SEQ ID NO: 224) 
5'-P03- 

CCGCTCTCACATCGATTTGG 
(SEQ ID NO:226) 
5'-P03- 

TCGCTCTCACATCGATTTGG 
(SEQ ID NO: 228) 
5'-P03- 

CCTTCGGCACATCGATTTGG 
(SEQ ID NO:230) 
5'-P03- 

GCTTCGGCACATCGATTTGG 
(SEQ ID NO:232) 
5'-P03- 

CGGAACACACATCGATTTGG 
(SEQ ID NO: 234) 
5'-P03- 

CGCCAGACACATCGATTTGG 
(SEQ ID NO:236) 
5'-P03- 

CCGATAGCACATCGATTTGG 
(SEQ ID NO:237)


(SEQ ID NO:238) 




5'-P03- 


5'-P03- 




AAATCGATGTGCGAAAGGAG 


CCTTTCGCACATCGATTTGG 


1.60 


(SEQ ID NO:239) 


(SEQ ID NO:240) 




5'-P03- 


5'-P03- 




AAATCGATGTGCCGAAGAAG 


TCTTCGGCACATCGATTTGG 


1.61 


(SEQ ID NO:241) 


(SEQ ID NO:242) 




5'-P03- 


5'-P03- 




AAATCGATGTGGTTGCAGAG 


CTGCAACCACATCGATTTGG 


1.62 


(SEQ ID NO:243)


(SEQ ID NO: 244) 




5'-P03- 


5'-P03- 




AAATCGATGTGGATGGTGAG 


CACCATCCACATCGATTTGG 


1.63 


(SEQ ID NO:245) 


(SEQ ID NO:246) 




5'-P03- 


5'-P03- 




AAATCGATGTGCTATCGCAG 


GCGATAGCACATCGATTTGG 


1.64 


(SEQ ID NO:247) 


(SEQ ID NO:248) 




5'-P03- 


5'-P03- 




AAATCGATGTGCGAAAGCAG 


GCTTTCGCACATCGATTTGG


1.65 


(SEQ ID NO:249) 


(SEQ ID NO:250)




5'-P03- 


5'-P03- 




AAATCGATGTGACACTGGAG 


CCAGTGTCACATCGATTTGG 


1.66 


(SEQ ID NO:251)


(SEQ ID NO: 252) 




5'-P03- 


5'-P03- 




AAATCGATGTGTCTGGCAAG 


TGCCAGACACATCGATTTGG 


1.67 


(SEQ ID NO:253) 


(SEQ ID NO:254) 




5'-P03- 


5'-P03- 




AAATCGATGTGGATGGTCAG


GACCATCCACATCGATTTGG 


1.68 


(SEQ ID NO:255) 


(SEQ ID NO:256) 




5'-P03- 


5'-P03- 




AAATCGATGTGGTTGCACAG 


GTGCAACCACATCGATTTGG


1.69 


(SEQ ID NO: 257) 


(SEQ ID NO: 258) 




5'-P03- 


5'-P03-CGATGCCCACATCGA




AAATCGATGTGGGCATCGAG 


TTTGG 


1.70 


(SEQ ID NO:259) 


(SEQ ID NO:260)




5'-P03- 


5'-P03- 




AAATCGATGTGTGCCTCCAG 


GGAGGCACACATCGATTTGG 


1.71 


(SEQ ID NO:261) 


(SEQ ID NO:262) 




5'-P03- 


5'-P03- 




AAATCGATGTGTGCCTCAAG 


TGAGGCACACATCGATTTGG 


1.72 


(SEQ ID NO:263) 


(SEQ ID NO:264) 




5'-P03- 


5'-P03- 




AAATCGATGTGGGCATCCAG 


GGATGCCCACATCGATTTGG 


1.73 


(SEQ ID NO:265) 


(SEQ ID NO:266) 




5'-P03- 


5'-P03-TGATGCCCA CAT CGA 




AAATCGATGTGGGCATCAAG 


TTTGG 


1.74 


(SEQ ID NO:267) 


(SEQ ID NO:268) 




5'-P03- 


5'-P03-CGA GAG GCA CAT 




AAATCGATGTGCCTGTCGAG 


CGA TTT GG 


1.75 


(SEQ ID NO:269) 


(SEQ ID NO:270) 




5'-P03- 


5'-P03-ATC CGT CCA CAT 




AAATCGATGTGGACGGATAG 


CGA TTTGG 


1.76 


(SEQ ID NO:271) 


(SEQ ID NO:272) 
5'-P03-

AAATCGATGTGCCTGTCCAG 

1.77 (SEQ ID NO: 273) 
5'-P03- 

AAATCGATGTGAAGCACGAG 

1.78 (SEQ ID NO:275) 
5'-P03- 

AAATCGATGTGCCTGTCAAG 

1.79 (SEQ ID NO: 277) 
5'-P03- 

AAATCGATGTGAAGCACCAG 

1.80 (SEQ ID NO: 279) 

5'-P03-AAATCGATGTGCCTTCGTAG 

1.81 (SEQ ID NO:281) 
5'-P03- 

AAATCGATGTGTCGTCCGAG 

1.82 (SEQ ID NO:283) 
5'-P03- 

AAATCGATGTGGAGTCTGAG 

1.83 (SEQ ID NO:285)
5'-P03- 

AAATCGATGTGTGATCCGAG 
1.84 (SEQ ID NO:287) 



5'-P03-GGA CAG GCA CAT 
CGATTTGG 

(SEQ ID NO:274) 
5'-P03-CGT GCT TCA CAT 
CGATTTGG 

(SEQ ID NO:276) 
5'-P03-TGA CAG GCA CAT 
CGATTTGG 

(SEQ ID NO:278) 
5'-P03-GGT GCT TCA CAT 
CGATTTGG 

(SEQ ID NO:280)
5'-P03-ACG AAG GCA CAT 
CGATTTGG 

(SEQ ID NO:282) 
5'-P03-CGG ACG ACA CAT 
CGATTTGG 

(SEQ ID NO: 284)
5'-P03-CAG ACT CCA CAT 
CGATTTGG 

(SEQ ID NO:286) 
5'-P03-CGG ATC ACA CAT 
CGATTTGG 

(SEQ ID NO: 288) 



5'-P03- 

AAATCGATGTGTCAGGCGAG 

1.85 (SEQ ID NO:289) 
5'-P03- 

AAATCGATGTGTCGTCCAAG 

1.86 (SEQ ID NO: 291) 
5'-P03- 

AAATCGATGTGGACGGAGAG 

1.87 (SEQ ID NO: 293) 
5'-P03- 

AAATCGATGTGGTAGCAGAG 

1.88 (SEQ ID NO: 295) 
5'-P03- 

AAATCGATGTGGCTGTGTAG 

1.89 (SEQ ID NO: 297) 
5'-P03- 

AAATCGATGTGGACGGACAG 

1.90 (SEQ ID NO: 299) 
5'-P03- 

AAATCGATGTGTCAGGCAAG 

1.91 (SEQ ID NO: 301) 
5'-P03- 

AAATCGATGTGGCTCGAAAG 

1.92 (SEQ ID NO: 303) 
5'-P03- 

AAATCGATGTGCCTTCGGAG 

1.93 (SEQ ID NO: 305) 
5'-P03- 

1.94 AAATCGATGTGGTAGCACAG 



5'-P03-CGC CTG ACA CAT 
CGATTTGG 
(SEQ ID NO: 290) 
5'-P03-TGG ACG ACA CAT 
CGATTTGG 
(SEQ ID NO: 292) 
5'-P03-CTC CGT CCA CAT 
CGATTTGG 
(SEQ ID NO:294) 
5'-P03-CTG CTA CCA CAT 
CGATTTGG 
(SEQ ID NO: 296) 
5'-P03- 

ACACAGCCACATCGATTTGG

(SEQ ID NO: 298) 
5'-P03-GTC CGT CCA CAT 
CGATTTGG 

(SEQ ID NO: 300) 
5'-P03-TGC CTG ACA CAT 
CGATTTGG 

(SEQ ID NO: 302) 

5'-P03- 

TTCGAGCCACATCGATTTGG 

(SEQ ID NO: 304) 
5'-P03-CCG AAG GCA CAT 
CGATTTGG 

(SEQ ID NO: 306) 
5'-P03-GTG CTA CCA CAT 
CGATTTGG 
(SEQ ID NO:307)
5'-P03- 

AAATCGATGTGGAAGGTCAG 
(SEQ ID NO:309) 
5'-P03- 

AAATCGATGTGGTGCTGTAG 
(SEQ ID NO:311)



(SEQ ID NO:308) 

5'-P03-GAC CTT CCA CAT 
CGATTTGG 
(SEQ ID NO:310) 
5'-P03-ACA GCA CCA CAT 
CGA TTT GG 
(SEQ ID NO:312) 
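
The cycle 1 pairs listed in Table 3 appear to share one layout: the top strand without its last two bases is the reverse complement of the bottom strand without its last two bases, so each strand is left with a two-base 3' overhang (AG on the top strand, GG on the bottom strand). The following Python sketch checks a listed pair against that layout. The function names are illustrative only, and the staggered-duplex rule is an inference from the tabulated sequences, not a statement taken from the text.

```python
# Illustrative helpers; the two-base 3' overhang rule is inferred from Table 3.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_staggered_pair(top: str, bottom: str, overhang: int = 2) -> bool:
    """True if the two strands pair everywhere except `overhang` bases at each 3' end."""
    return reverse_complement(top[:-overhang]) == bottom[:-overhang]


# Tag 1.24 as listed in Table 3.
TOP_1_24 = "AAATCGATGTGGTCAGGTAG"
BOTTOM_1_24 = "ACCTGACCACATCGATTTGG"

print(is_staggered_pair(TOP_1_24, BOTTOM_1_24))  # True
```

A pair that fails this check is a candidate for a transcription error in either strand and would need to be compared against the original sequence listing.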



Table 4: Oligonucleotide tags used in cycle 2: 



Tag number   Top strand sequence   Bottom strand sequence

5'-P03-GTT GCC TGT 5'-P03-AGG CAA CCT

2.1 (SEQ ID NO: 313) (SEQ ID NO: 314)
5'-P03-CAG GAC GGT 5'-P03-CGT CCT GCT

2.2 (SEQ ID NO: 315) (SEQ ID NO: 316)
5'-P03-AGA CGT GGT 5'-P03-CAC GTC TCT

2.3 (SEQ ID NO: 317) (SEQ ID NO: 318)
5'-P03-CAG GAC CGT 5'-P03-GGT CCT GCT

2.4 (SEQ ID NO: 319) (SEQ ID NO: 320)
5'-P03-CAG GAC AGT 5'-P03-TGT CCT GCT

2.5 (SEQ ID NO: 321) (SEQ ID NO: 322)
5'-P03-CAC TCT GGT 5'-P03-CAG AGT GCT

2.6 (SEQ ID NO: 323) (SEQ ID NO: 324)
5'-P03-GAC GGC TGT 5'-P03-AGC CGT CCT

2.7 (SEQ ID NO: 325) (SEQ ID NO: 326)
5'-P03-CACTCTCGT 5'-P03-GAG AGT GCT

2.8 (SEQ ID NO: 327) (SEQ ID NO: 328)
5'-P03-GTA GCC TGT 5'-P03-AGG CTA CCT

2.9 (SEQ ID NO: 329) (SEQ ID NO: 330)
5'-P03-GCC ACT TGT 5'-P03-AAG TGG CCT

2.10 (SEQ ID NO: 331) (SEQ ID NO: 332)
5'-P03-CAT CGC TGT 5'-P03-AGC GAT GCT

2.11 (SEQ ID NO: 333) (SEQ ID NO: 334)
5'-P03-CAC TGG TGT 5'-P03-ACC AGT GCT

2.12 (SEQ ID NO: 335) (SEQ ID NO: 336)
5'-P03-GCC ACT GGT 5'-P03-CAG TGG CCT

2.13 (SEQ ID NO: 337) (SEQ ID NO: 338)
5'-P03-TCT GGC TGT 5'-P03-AGC CAG ACT

2.14 (SEQ ID NO: 339) (SEQ ID NO: 340)
5'-P03-GCC ACT CGT 5'-P03-GAG TGG CCT

2.15 (SEQ ID NO: 341) (SEQ ID NO: 342)
5'-P03-TGC CTC TGT 5'-P03-AGA GGC ACT

2.16 (SEQ ID NO: 343) (SEQ ID NO: 344)
5'-P03-CAT CGC AGT 5'-P03-TGC GAT GCT

2.17 (SEQ ID NO: 345) (SEQ ID NO: 346)
5'-P03-CAG GAA GGT 5'-P03-CTT CCT GCT

2.18 (SEQ ID NO: 347) (SEQ ID NO: 348)
5'-P03-GGC ATC TGT 5'-P03-AGA TGC CCT

2.19 (SEQ ID NO: 349) (SEQ ID NO: 350)
5'-P03-CGG TGC TGT 5'-P03-AGC ACC GCT

2.20 (SEQ ID NO: 351) (SEQ ID NO: 352)
5'-P03-CACTGGCGT 5'-P03-GCC AGT GCT

2.21 (SEQ ID NO: 353) (SEQ ID NO: 354)
5'-P03-TCTCCTCGT 5'-P03-GAGGAGACT

2.22 (SEQ ID NO: 355) (SEQ ID NO: 356)
5'-P03-CCT GTC TGT 5'-P03-AGA CAG GCT

2.23 (SEQ ID NO: 357) (SEQ ID NO: 358)
5'-P03-CAA CGC TGT 5'-P03-AGC GTT GCT

2.24 (SEQ ID NO: 359) (SEQ ID NO: 360)
5'-P03-TGC CTC GGT 5'-P03-CGAGGCACT

2.25 (SEQ ID NO: 361) (SEQ ID NO: 362)
5'-P03-ACACTGCGT 5'-P03-GCA GTG TCT

2.26 (SEQ ID NO: 363) (SEQ ID NO: 364)
5'-P03-TCG TCC TGT 5'-P03-AGGACGACT

2.27 (SEQ ID NO: 365) (SEQ ID NO: 366)
5'-P03-GCT GCC AGT 5'-P03-TGG CAG CCT

2.28 (SEQ ID NO: 367) (SEQ ID NO: 368)
5'-P03-TCA GGC TGT 5'-P03-AGC CTG ACT

2.29 (SEQ ID NO: 369) (SEQ ID NO: 370)
5'-P03-GCC AGG TGT 5'-P03-ACC TGG CCT

2.30 (SEQ ID NO: 371) (SEQ ID NO: 372)
5'-P03-CGG ACC TGT 5'-P03-AGG TCC GCT

2.31 (SEQ ID NO: 373) (SEQ ID NO: 374)
5'-P03-CAA CGC AGT 5'-P03-TGC GTT GCT

2.32 (SEQ ID NO: 375) (SEQ ID NO: 376)
5'-P03-CACACGAGT 5'-P03-TCG TGT GCT

2.33 (SEQ ID NO: 377) (SEQ ID NO: 378)
5'-P03-ATG GCC TGT 5'-P03-AGG CCA TCT

2.34 (SEQ ID NO: 379) (SEQ ID NO: 380)
5'-P03-CCA GTC TGT 5'-P03-AGA CTG GCT

2.35 (SEQ ID NO: 381) (SEQ ID NO: 382)
5'-P03-GCC AGG AGT 5'-P03-TCC TGG CCT

2.36 (SEQ ID NO: 383) (SEQ ID NO: 384)
5'-P03-CGG ACC AGT 5'-P03-TGG TCC GCT

2.37 (SEQ ID NO: 385) (SEQ ID NO: 386)
5'-P03-CCT TCG CGT 5'-P03-GCGAAGGCT

2.38 (SEQ ID NO: 387) (SEQ ID NO: 388)
5'-P03-GCA GCC AGT 5'-P03-TGG CTG CCT

2.39 (SEQ ID NO: 389) (SEQ ID NO: 390)
5'-P03-CCAGTCGGT 5'-P03-CGA CTG GCT

2.40 (SEQ ID NO: 391) (SEQ ID NO: 392)
5'-P03-ACTGAGCGT 5'-P03-GCT CAG TCT

2.41 (SEQ ID NO: 393) (SEQ ID NO: 394)
5'-P03-CCA GTC CGT 5'-P03-GGA CTG GCT

2.42 (SEQ ID NO: 395) (SEQ ID NO: 396)
5'-P03-CCA GTC AGT 5'-P03-TGACTGGCT

2.43 (SEQ ID NO: 397) (SEQ ID NO: 398)
5'-P03-CAT CGA GGT 5'-P03-CTC GAT GCT

2.44 (SEQ ID NO: 399) (SEQ ID NO: 400)
5'-P03-CCA TCG TGT 5'-P03-ACGATGGCT

2.45 (SEQ ID NO: 401) (SEQ ID NO: 402)
5'-P03-GTG CTG CGT 5'-P03-GCA GCA CCT

2.46 (SEQ ID NO: 403) (SEQ ID NO: 404)
5'-P03-GAC TAC GGT 5'-P03-CGT AGT CCT

2.47 (SEQ ID NO: 405) (SEQ ID NO: 406)
5'-P03-GTG CTG AGT 5'-P03-TCAGCACCT

2.48 (SEQ ID NO: 407) (SEQ ID NO: 408)
5'-P03-GCTGCATGT 5'-P03-ATGCAGCCT

2.49 (SEQ ID NO: 409) (SEQ ID NO: 410)
5'-P03-GAGTGGTGT 5'-P03-ACCACTCCT

2.50 (SEQ ID NO: 411) (SEQ ID NO: 412)
5'-P03-GACTACCGT 5'-P03-GGTAGTCCT

2.51 (SEQ ID NO: 413) (SEQ ID NO: 414)
5'-P03-CGGTGATGT 5'-P03-ATCACCGCT

2.52 (SEQ ID NO: 415) (SEQ ID NO: 416)
5'-P03-TGCGACTGT 5'-P03-AGTCGCACT

2.53 (SEQ ID NO: 417) (SEQ ID NO: 418)
5'-P03-TCTGGAGGT 5'-P03-CTCCAGACT

2.54 (SEQ ID NO: 419) (SEQ ID NO: 420)
5'-P03-AGCACTGGT 5'-P03-CAGTGCTCT

2.55 (SEQ ID NO: 421) (SEQ ID NO: 422)
5'-P03-TCGCTTGGT 5'-P03-CAAGCGACT

2.56 (SEQ ID NO: 423) (SEQ ID NO: 424)
5'-P03-AGCACTCGT 5'-P03-GAGTGCTCT

2.57 (SEQ ID NO: 425) (SEQ ID NO: 426)
5'-P03-GCGATTGGT 5'-P03-CAATCGCCT

2.58 (SEQ ID NO: 427) (SEQ ID NO: 428)
5'-P03-CCATCGCGT 5'-P03-GCGATGGCT

2.59 (SEQ ID NO: 429) (SEQ ID NO: 430)
5'-P03-TCGCTTCGT 5'-P03-GAAGCGACT

2.60 (SEQ ID NO: 431) (SEQ ID NO: 432)
5'-P03-AGTGCCTGT 5'-P03-AGGCACTCT

2.61 (SEQ ID NO: 433) (SEQ ID NO: 434)
5'-P03-GGCATAGGT 5'-P03-CTATGCCCT

2.62 (SEQ ID NO: 435) (SEQ ID NO: 436)
5'-P03-GCGATTCGT 5'-P03-GAATCGCCT

2.63 (SEQ ID NO: 437) (SEQ ID NO: 438)
5'-P03-TGCGACGGT 5'-P03-CGTCGCACT

2.64 (SEQ ID NO: 439) (SEQ ID NO: 440)
5'-P03-GAGTGGCGT 5'-P03-GCCACTCCT

2.65 (SEQ ID NO: 441) (SEQ ID NO: 442)
5'-P03-CGGTGAGGT 5'-P03-CTCACCGCT

2.66 (SEQ ID NO: 443) (SEQ ID NO: 444)
5'-P03-GCTGCAAGT 5'-P03-TTGCAGCCT

2.67 (SEQ ID NO: 445) (SEQ ID NO: 446)
5'-P03-TTCCGCTGT 5'-P03-AGCGGAACT

2.68 (SEQ ID NO: 447) (SEQ ID NO: 448)
5'-P03-GAGTGGAGT 5'-P03-TCCACTCCT

2.69 (SEQ ID NO: 449) (SEQ ID NO: 450)
5'-P03-ACAGAGCGT 5'-P03-GCTCTGTCT

2.70 (SEQ ID NO: 451) (SEQ ID NO: 452)
5'-P03-TGCGACCGT 5'-P03-GGTCGCACT

2.71 (SEQ ID NO: 453) (SEQ ID NO: 454)
5'-P03-CCTGTAGGT 5'-P03-CTACAGGCT

2.72 (SEQ ID NO: 455) (SEQ ID NO: 456)
5'-P03-TAGCCGTGT 5'-P03-ACGGCTACT

2.73 (SEQ ID NO: 457) (SEQ ID NO: 458)
5'-P03-TGCGACAGT 5'-P03-TGTCGCACT

2.74 (SEQ ID NO: 459) (SEQ ID NO: 460)
5'-P03-GGTCTGTGT 5'-P03-ACAGACCCT

2.75 (SEQ ID NO: 461) (SEQ ID NO: 462)
5'-P03-CGGTGAAGT 5'-P03-TTCACCGCT

2.76 (SEQ ID NO: 463) (SEQ ID NO: 464)
5'-P03-CAACGAGGT 5'-P03-CTCGTTGCT

2.77 (SEQ ID NO: 465) (SEQ ID NO: 466)
5'-P03-GCAGCATGT 5'-P03-ATGCTGCCT

2.78 (SEQ ID NO: 467) (SEQ ID NO: 468)
5'-P03-TCGTCAGGT 5'-P03-CTGACGACT

2.79 (SEQ ID NO: 469) (SEQ ID NO: 470)
5'-P03-AGTGCCAGT 5'-P03-TGGCACTCT

2.80 (SEQ ID NO: 471) (SEQ ID NO: 472)
5'-P03-TAGAGGCGT 5'-P03-GCCTCTACT

2.81 (SEQ ID NO: 473) (SEQ ID NO: 474)
5'-P03-GTCAGCGGT 5'-P03-CGCTGACCT

2.82 (SEQ ID NO: 475) (SEQ ID NO: 476)
5'-P03-TCAGGAGGT 5'-P03-CTCCTGACT

2.83 (SEQ ID NO: 477) (SEQ ID NO: 478)
5'-P03-AGCAGGTGT 5'-P03-ACCTGCTCT

2.84 (SEQ ID NO: 479) (SEQ ID NO: 480)
5'-P03-TTCCGCAGT 5'-P03-TGCGGAACT

2.85 (SEQ ID NO: 481) (SEQ ID NO: 482)
5'-P03-GTCAGCCGT 5'-P03-GGCTGACCT

2.86 (SEQ ID NO: 483) (SEQ ID NO: 484)
5'-P03-GGTCTGCGT 5'-P03-GCAGACCCT

2.87 (SEQ ID NO: 485) (SEQ ID NO: 486)
5'-P03-TAGCCGAGT 5'-P03-TCGGCTACT

2.88 (SEQ ID NO: 487) (SEQ ID NO: 488)
5'-P03-GTCAGCAGT 5'-P03-TGCTGACCT

2.89 (SEQ ID NO: 489) (SEQ ID NO: 490)
5'-P03-GGTCTGAGT 5'-P03-TCAGACCCT

2.90 (SEQ ID NO: 491) (SEQ ID NO: 492)
5'-P03-CGGACAGGT 5'-P03-CTGTCCGCT

2.91 (SEQ ID NO: 493) (SEQ ID NO: 494)
5'-P03-TTAGCCGGT5'-P03-3' 5'-P03-CGGCTAACT5'-P03-3'

2.92 (SEQ ID NO: 495) (SEQ ID NO: 496)
5'-P03-GAGACGAGT 5'-P03-TCGTCTCCT

2.93 (SEQ ID NO: 497) (SEQ ID NO: 498)
5'-P03-CGTAACCGT 5'-P03-GGTTACGCT

2.94 (SEQ ID NO: 499) (SEQ ID NO: 500)
5'-P03-TTGGCGTGT5'-P03-3' 5'-P03-ACGCCAACT5'-P03-3'

2.95 (SEQ ID NO: 501) (SEQ ID NO: 502)
5'-P03-ATGGCAGGT 5'-P03-CTGCCATCT

2.96 (SEQ ID NO: 503) (SEQ ID NO: 504)
Table 5. Oligonucleotide tags used in cycle 3

Tag number   Top strand sequence   Bottom strand sequence

5'-P03-CAG CTA CGA 5'-P03-GTAGCTGAC

3.1 (SEQ ID NO: 505) (SEQ ID NO: 506)
5'-P03-CTC CTG CGA 5'-P03-GCA GGA GAC

3.2 (SEQ ID NO: 507) (SEQ ID NO: 508)
5'-P03-GCTGCCTGA 5'-P03-AGG CAG CAC

3.3 (SEQ ID NO: 509) (SEQ ID NO: 510)
5'-P03-CAG GAA CGA 5'-P03-GTT CCT GAC

3.4 (SEQ ID NO: 511) (SEQ ID NO: 512)
5'-P03-CAC ACG CGA 5'-P03-GCG TGT GAC

3.5 (SEQ ID NO: 513) (SEQ ID NO: 514)
5'-P03-GCA GCC TGA 5'-P03-AGG CTG CAC

3.6 (SEQ ID NO: 515) (SEQ ID NO: 516)
5'-P03-CTG AAC GGA 5'-P03-CGT TCA GAC

3.7 (SEQ ID NO: 517) (SEQ ID NO: 518)
5'-P03-CTG AAC CGA 5'-P03-GGT TCA GAC

3.8 (SEQ ID NO: 519) (SEQ ID NO: 520)
5'-P03-TCT GGA CGA 5'-P03-GTC CAG AAC

3.9 (SEQ ID NO: 521) (SEQ ID NO: 522)
5'-P03-TGC CTA CGA 5'-P03-GTA GGC AAC

3.10 (SEQ ID NO: 523) (SEQ ID NO: 524)
5'-P03-GGC ATA CGA 5'-P03-GTA TGC CAC

3.11 (SEQ ID NO: 525) (SEQ ID NO: 526)
5'-P03-CGG TGA CGA 5'-P03-GTC ACC GAC

3.12 (SEQ ID NO: 527) (SEQ ID NO: 528)
5'-P03-CAA CGA CGA 5'-P03-GTC GTT GAC

3.13 (SEQ ID NO: 529) (SEQ ID NO: 530)
5'-P03-CTC CTC TGA 5'-P03-AGA GGA GAC

3.14 (SEQ ID NO: 531) (SEQ ID NO: 532)
5'-P03-TCA GGA CGA 5'-P03-GTC CTG AAC

3.15 (SEQ ID NO: 533) (SEQ ID NO: 534)
5'-P03-AAA GGC GGA 5'-P03-CGC CTT TAC

3.16 (SEQ ID NO: 535) (SEQ ID NO: 536)
5'-P03-CTC CTC GGA 5'-P03-CGA GGA GAC

3.17 (SEQ ID NO: 537) (SEQ ID NO: 538)
5'-P03-CAG ATG CGA 5'-P03-GCA TCT GAC

3.18 (SEQ ID NO: 539) (SEQ ID NO: 540)
5'-P03-GCA GCA AGA 5'-P03-TTG CTG CAC

3.19 (SEQ ID NO: 541) (SEQ ID NO: 542)
5'-P03-GTG GAG TGA 5'-P03-ACT CCA CAC

3.20 (SEQ ID NO: 543) (SEQ ID NO: 544)
5'-P03-CCA GTA GGA 5'-P03-CTA CTG GAC

3.21 (SEQ ID NO: 545) (SEQ ID NO: 546)
5'-P03-ATG GCA CGA 5'-P03-GTG CCA TAC

3.22 (SEQ ID NO: 547) (SEQ ID NO: 548)
5'-P03-GGA CTG TGA 5'-P03-ACA GTC CAC

3.23 (SEQ ID NO: 549) (SEQ ID NO: 550)
5'-P03-CCG AAC TGA 5'-P03-AGT TCG GAC

3.24 (SEQ ID NO: 551) (SEQ ID NO: 552)
5'-P03-CTC CTC AGA 5'-P03-TGA GGA GAC

3.25 (SEQ ID NO: 553) (SEQ ID NO: 554)
5'-P03-CAC TGC TGA 5'-P03-AGC AGT GAC

3.26 (SEQ ID NO: 555) (SEQ ID NO: 556)
5'-P03-AGC AGG CGA 5'-P03-GCC TGC TAC

3.27 (SEQ ID NO: 557) (SEQ ID NO: 558)
5'-P03-AGC AGG AGA 5'-P03-TCC TGC TAC

3.28 (SEQ ID NO: 559) (SEQ ID NO: 560)
5'-P03-AGA GCC AGA 5'-P03-TGG CTC TAC

3.29 (SEQ ID NO: 561) (SEQ ID NO: 562)
5'-P03-GTC GTT GGA 5'-P03-CAA CGA CAC

3.30 (SEQ ID NO: 563) (SEQ ID NO: 564)
5'-P03-CCG AAC GGA 5'-P03-CGT TCG GAC

3.31 (SEQ ID NO: 565) (SEQ ID NO: 566)
5'-P03-CAC TGC GGA 5'-P03-CGC AGT GAC

3.32 (SEQ ID NO: 567) (SEQ ID NO: 568)
5'-P03-GTGGAGCGA 5'-P03-GCT CCA CAC

3.33 (SEQ ID NO: 569) (SEQ ID NO: 570)
5'-P03-GTG GAG AGA 5'-P03-TCT CCA CAC

3.34 (SEQ ID NO: 571) (SEQ ID NO: 572)
5'-P03-GGA CTG CGA 5'-P03-GCA GTC CAC

3.35 (SEQ ID NO: 573) (SEQ ID NO: 574)
5'-P03-CCG AAC CGA 5'-P03-GGT TCG GAC

3.36 (SEQ ID NO: 575) (SEQ ID NO: 576)
5'-P03-CAC TGC CGA 5'-P03-GGC AGT GAC

3.37 (SEQ ID NO: 577) (SEQ ID NO: 578)
5'-P03-CGA AAC GGA 5'-P03-CGT TTC GAC

3.38 (SEQ ID NO: 579) (SEQ ID NO: 580)
5'-P03-GGA CTG AGA 5'-P03-TCA GTC CAC

3.39 (SEQ ID NO: 581) (SEQ ID NO: 582)
5'-P03-CCG AAC AGA 5'-P03-TGT TCG GAC

3.40 (SEQ ID NO: 583) (SEQ ID NO: 584)
5'-P03-CGA AAC CGA 5'-P03-GGT TTC GAC

3.41 (SEQ ID NO: 585) (SEQ ID NO: 586)
5'-P03-CTG GCT TGA 5'-P03-AAG CCA GAC

3.42 (SEQ ID NO: 587) (SEQ ID NO: 588)
5'-P03-CAC ACC TGA 5'-P03-AGG TGT GAC

3.43 (SEQ ID NO: 589) (SEQ ID NO: 590)
5'-P03-AACGACCGA 5'-P03-GGTCGTTAC

3.44 (SEQ ID NO: 591) (SEQ ID NO: 592)
5'-P03-ATC GAG CGA 5'-P03-GCT CGA TAC

3.45 (SEQ ID NO: 593) (SEQ ID NO: 594)
5'-P03-TGC GAA GGA 5'-P03-CTT CGC AAC

3.46 (SEQ ID NO: 595) (SEQ ID NO: 596)
5'-P03-TGC GAA CGA 5'-P03-GTT CGC AAC

3.47 (SEQ ID NO: 597) (SEQ ID NO: 598)
5'-P03-CTGGCTGGA 5'-P03-CAGCCAGAC

3.48 (SEQ ID NO: 599) (SEQ ID NO: 600)
5'-P03-CAC ACC GGA 5'-P03-CGGTGTGAC

3.49 (SEQ ID NO: 601) (SEQ ID NO: 602)
5'-P03-AGT GCA GGA 5'-P03-CTG CAC TAC

3.50 (SEQ ID NO: 603) (SEQ ID NO: 604)
5'-P03-GAC CGT TGA 5'-P03-AAC GGT CAC

3.51 (SEQ ID NO: 605) (SEQ ID NO: 606)
5'-P03-GGT GAG TGA 5'-P03-ACT CAC CAC

3.52 (SEQ ID NO: 607) (SEQ ID NO: 608)
5'-P03-CCT TCC TGA 5'-P03-AGG AAG GAC

3.53 (SEQ ID NO: 609) (SEQ ID NO: 610)
5'-P03-CTG GCT AGA 5'-P03-TAG CCA GAC

3.54 (SEQ ID NO: 611) (SEQ ID NO: 612)
5'-P03-CAC ACC AGA 5'-P03-TGG TGT GAC

3.55 (SEQ ID NO: 613) (SEQ ID NO: 614)
5'-P03-AGC GGT AGA 5'-P03-TAC CGC TAC

3.56 (SEQ ID NO: 615) (SEQ ID NO: 616)
5'-P03-GTC AGA GGA 5'-P03-CTCTGACAC

3.57 (SEQ ID NO: 617) (SEQ ID NO: 618)
5'-P03-TTCCGACGA 5'-P03-GTC GGA AAC

3.58 (SEQ ID NO: 619) (SEQ ID NO: 620)
5'-P03-AGGCGTAGA 5'-P03-TACGCCTAC

3.59 (SEQ ID NO: 621) (SEQ ID NO: 622)
5'-P03-CTC GAC TGA 5'-P03-AGTCGAGAC

3.60 (SEQ ID NO: 623) (SEQ ID NO: 624)
5'-P03-TAC GCT GGA 5'-P03-CAG CGT AAC

3.61 (SEQ ID NO: 625) (SEQ ID NO: 626)
5'-P03-GTT CGG TGA 5'-P03-ACC GAA CAC

3.62 (SEQ ID NO: 627) (SEQ ID NO: 628)
5'-P03-GCC AGC AGA 5'-P03-TGC TGG CAC

3.63 (SEQ ID NO: 629) (SEQ ID NO: 630)
5'-P03-GAC CGT AGA 5'-P03-TAC GGT CAC

3.64 (SEQ ID NO: 631) (SEQ ID NO: 632)
5'-P03-GTGCTCTGA 5'-P03-AGA GCA CAC

3.65 (SEQ ID NO: 633) (SEQ ID NO: 634)
5'-P03-GGT GAG CGA 5'-P03-GCT CAC CAC

3.66 (SEQ ID NO: 635) (SEQ ID NO: 636)
5'-P03-GGTGAGAGA 5'-P03-TCT CAC CAC

3.67 (SEQ ID NO: 637) (SEQ ID NO: 638)
5'-P03-CCT TCC AGA 5'-P03-TGG AAG GAC

3.68 (SEQ ID NO: 639) (SEQ ID NO: 640)
5'-P03-CTC CTA CGA 5'-P03-GTA GGA GAC

3.69 (SEQ ID NO: 641) (SEQ ID NO: 642)
5'-P03-CTCGACGGA 5'-P03-CGT CGA GAC

3.70 (SEQ ID NO: 643) (SEQ ID NO: 644)
5'-P03-GCC GTT TGA 5'-P03-AAA CGG CAC

3.71 (SEQ ID NO: 645) (SEQ ID NO: 646)
5'-P03-GCG GAG TGA 5'-P03-ACT CCG CAC

3.72 (SEQ ID NO: 647) (SEQ ID NO: 648)
5'-P03-CGTGCTTGA 5'-P03-AAG CAC GAC

3.73 (SEQ ID NO: 649) (SEQ ID NO: 650)
5'-P03-CTC GAC CGA 5'-P03-GGT CGA GAC

3.74 (SEQ ID NO: 651) (SEQ ID NO: 652)
5'-P03-AGAGCAGGA 5'-P03-CTG CTC TAC

3.75 (SEQ ID NO: 653) (SEQ ID NO: 654)
5'-P03-GTG CTC GGA 5'-P03-CGA GCA CAC

3.76 (SEQ ID NO: 655) (SEQ ID NO: 656)
5'-P03-CTC GAC AGA 5'-P03-TGT CGA GAC

3.77 (SEQ ID NO: 657) (SEQ ID NO: 658)
5'-P03-GGA GAG TGA 5'-P03-ACT CTC CAC

3.78 (SEQ ID NO: 659) (SEQ ID NO: 660)
5'-P03-AGG CTG TGA 5'-P03-ACA GCC TAC

3.79 (SEQ ID NO: 661) (SEQ ID NO: 662)
5'-P03-AGA GCA CGA 5'-P03-GTG CTC TAC

3.80 (SEQ ID NO: 663) (SEQ ID NO: 664)
5'-P03-CCA TCC TGA 5'-P03-AGG ATG GAC

3.81 (SEQ ID NO: 665) (SEQ ID NO: 666)
5'-P03-GTTCGGAGA 5'-P03-TCC GAA CAC

3.82 (SEQ ID NO: 667) (SEQ ID NO: 668)
5'-P03-TGGTAGCGA 5'-P03-GCT ACC AAC

3.83 (SEQ ID NO: 669) (SEQ ID NO: 670)
5'-P03-GTG CTC CGA 5'-P03-GGA GCA CAC

3.84 (SEQ ID NO: 671) (SEQ ID NO: 672)
5'-P03-GTG CTC AGA 5'-P03-TGA GCA CAC

3.85 (SEQ ID NO: 673) (SEQ ID NO: 674)
5'-P03-GCC GTT GGA 5'-P03-CAA CGG CAC

3.86 (SEQ ID NO: 675) (SEQ ID NO: 676)
5'-P03-GAG TGC TGA 5'-P03-AGC ACT CAC

3.87 (SEQ ID NO: 677) (SEQ ID NO: 678)
5'-P03-GCT CCT TGA 5'-P03-AAG GAG CAC

3.88 (SEQ ID NO: 679) (SEQ ID NO: 680)
5'-P03-CCGAAAGGA 5'-P03-CTT TCG GAC

3.89 (SEQ ID NO: 681) (SEQ ID NO: 682)
5'-P03-CAC TGA GGA 5'-P03-CTC AGT GAC

3.90 (SEQ ID NO: 683) (SEQ ID NO: 684)
5'-P03-CGT GCT GGA 5'-P03-CAG CAC GAC

3.91 (SEQ ID NO: 685) (SEQ ID NO: 686)
5'-P03-CCG AAA CGA 5'-P03-GTT TCG GAC

3.92 (SEQ ID NO: 687) (SEQ ID NO: 688)
5'-P03-GCG GAG AGA 5'-P03-TCT CCG CAC

3.93 (SEQ ID NO: 689) (SEQ ID NO: 690)
5'-P03-GCC GTT AGA 5'-P03-TAA CGG CAC

3.94 (SEQ ID NO: 691) (SEQ ID NO: 692)
5'-P03-TCT CGT GGA 5'-P03-CAC GAG AAC

3.95 (SEQ ID NO: 693) (SEQ ID NO: 694)
5'-P03-CGT GCT AGA 5'-P03-TAG CAC GAC

3.96 (SEQ ID NO: 695) (SEQ ID NO: 696)
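
In the cycle 2 and cycle 3 tables above, each bottom strand appears to be the reverse complement of the first seven bases of its top strand, followed by two bases that are complementary to the 3' overhang of the preceding cycle's top strand (AG for cycle 1, GT for cycle 2). The Python sketch below derives the expected bottom strand under that assumption. The function names are hypothetical, and the rule is inferred from the listed sequences rather than stated in the text.

```python
# Assumed rule (inferred from the tables): bottom = revcomp(top core) + revcomp(upstream overhang).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def expected_bottom(top: str, upstream_overhang: str, overhang: int = 2) -> str:
    """Bottom strand expected for `top`, given the previous cycle's 3' overhang."""
    core = top[:-overhang]  # part of the top strand that is base-paired
    return reverse_complement(core) + reverse_complement(upstream_overhang)


# Tag 2.1 follows a cycle 1 tag, whose top strand ends in AG.
print(expected_bottom("GTTGCCTGT", "AG"))  # AGGCAACCT
# Tag 3.1 follows a cycle 2 tag, whose top strand ends in GT.
print(expected_bottom("CAGCTACGA", "GT"))  # GTAGCTGAC
```

Comparing each listed bottom strand with the derived one is a quick way to spot single-base transcription errors in either column.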



Table 6. Oligonucleotide tags used in cycle 4 



Tag number   Top strand sequence   Bottom strand sequence

5'-P03-GCCTGTCTT 5'-P03-GAC AGG CTC

4.1 (SEQ ID NO: 697) (SEQ ID NO: 698)
5'-P03-CTCCTGGTT 5'-P03-CCAGGAGTC

4.2 (SEQ ID NO: 699) (SEQ ID NO: 700)
5'-P03-ACTCTGCTT 5'-P03-GCA GAG TTC

4.3 (SEQ ID NO: 701) (SEQ ID NO: 702)
5'-P03-CATCGCCTT 5'-P03-GGC GAT GTC

4.4 (SEQ ID NO: 703) (SEQ ID NO: 704)
5'-P03-GCCACTATT 5'-P03-TAGTGGCTC

4.5 (SEQ ID NO: 705) (SEQ ID NO: 706)
5'-P03-CACACGGTT 5'-P03-CCGTGTGTC

4.6 (SEQ ID NO: 707) (SEQ ID NO: 708)
5'-P03-CAACGCCTT 5'-P03-GGC GTT GTC

4.7 (SEQ ID NO: 709) (SEQ ID NO: 710)
5'-P03-ACTGAGGTT 5'-P03-CCTCAGTTC

4.8 (SEQ ID NO: 711) (SEQ ID NO: 712)
5'-P03-GTGCTGGTT 5'-P03-CCAGCACTC

4.9 (SEQ ID NO: 713) (SEQ ID NO: 714)
5'-P03-CATCGACTT 5'-P03-GTC GAT GTC

4.10 (SEQ ID NO: 715) (SEQ ID NO: 716)
5'-P03-CCATCGGTT 5'-P03-CCGATGGTC

4.11 (SEQ ID NO: 717) (SEQ ID NO: 718)
5'-P03-GCTGCACTT 5'-P03-GTGCAGCTC

4.12 (SEQ ID NO: 719) (SEQ ID NO: 720)
5'-P03-ACAGAGGTT 5'-P03-CCTCTGTTC

4.13 (SEQ ID NO: 721) (SEQ ID NO: 722)
5'-P03-AGTGCCGTT 5'-P03-CGG CAC TTC

4.14 (SEQ ID NO: 723) (SEQ ID NO: 724)
5'-P03-CGGACATTT 5'-P03-ATGTCCGTC

4.15 (SEQ ID NO: 725) (SEQ ID NO: 726)
5'-P03-GGTCTGGTT 5'-P03-CCAGACCTC

4.16 (SEQ ID NO: 727) (SEQ ID NO: 728)
5'-P03-GAGACGGTT 5'-P03-CCG TCT CTC

4.17 (SEQ ID NO: 729) (SEQ ID NO: 730)
5'-P03-CTTTCCGTT 5'-P03-CGG AAA GTC

4.18 (SEQ ID NO: 731) (SEQ ID NO: 732)
5'-P03-CAGATGGTT 5'-P03-CCA TCT GTC

4.19 (SEQ ID NO: 733) (SEQ ID NO: 734)
5'-P03-CGGACACTT 5'-P03-GTGTCCGTC

4.20 (SEQ ID NO: 735) (SEQ ID NO: 736)
5'-P03-ACTCTCGTT 5'-P03-CGA GAG TTC

4.21 (SEQ ID NO: 737) (SEQ ID NO: 738)
5'-P03-GCAGCACTT 5'-P03-GTG CTG CTC

4.22 (SEQ ID NO: 739) (SEQ ID NO: 740)
5'-P03-ACTCTCCTT 5'-P03-GGAGAGTTC

4.23 (SEQ ID NO: 741) (SEQ ID NO: 742)
5'-P03-ACCTTGGTT 5'-P03-CCAAGGTTC

4.24 (SEQ ID NO: 743) (SEQ ID NO: 744)
5'-P03-AGAGCCGTT 5'-P03-CGG CTC TTC

4.25 (SEQ ID NO: 745) (SEQ ID NO: 746)
5'-P03-ACCTTGCTT 5'-P03-GCAAGGTTC

4.26 (SEQ ID NO: 747) (SEQ ID NO: 748)
5'-P03-AAGTCCGTT 5'-P03-CGG ACT TTC

4.27 (SEQ ID NO: 749) (SEQ ID NO: 750)
5'-P03-GGA CTG GTT 5'-P03-CCA GTC CTC

4.28 (SEQ ID NO: 751) (SEQ ID NO: 752)
5'-P03-GTCGTTCTT 5'-P03-GAA CGA CTC

4.29 (SEQ ID NO: 753) (SEQ ID NO: 754)
5'-P03-CAGCATCTT 5'-P03-GAT GCT GTC

4.30 (SEQ ID NO: 755) (SEQ ID NO: 756)
5'-P03-CTATCCGTT 5'-P03-CGG ATA GTC

4.31 (SEQ ID NO: 757) (SEQ ID NO: 758)
5'-P03-ACACTCGTT 5'-P03-CGAGTGTTC

4.32 (SEQ ID NO: 759) (SEQ ID NO: 760)
5'-P03-ATCCAGGTT 5'-P03-CCT GGA TTC

4.33 (SEQ ID NO: 761) (SEQ ID NO: 762)
5'-P03-GTTCCTGTT 5'-P03-CAG GAA CTC

4.34 (SEQ ID NO: 763) (SEQ ID NO: 764)
5'-P03-ACACTCCTT 5'-P03-GGA GTG TTC

4.35 (SEQ ID NO: 765) (SEQ ID NO: 766)
5'-P03-GTTCCTCTT 5'-P03-GAG GAA CTC

4.36 (SEQ ID NO: 767) (SEQ ID NO: 768)
5'-P03-CTGGCTCTT 5'-P03-GAG CCA GTC

4.37 (SEQ ID NO: 769) (SEQ ID NO: 770)
5'-P03-ACGGCATTT 5'-P03-ATG CCG TTC

4.38 (SEQ ID NO: 771) (SEQ ID NO: 772)
5'-P03-GGTGAGGTT 5'-P03-CCT CAC CTC

4.39 (SEQ ID NO: 773) (SEQ ID NO: 774)
5'-P03-CCTTCCGTT 5'-P03-CGG AAG GTC

4.40 (SEQ ID NO: 775) (SEQ ID NO: 776)
5'-P03-TACGCTCTT 5'-P03-GAG CGT ATC

4.41 (SEQ ID NO: 777) (SEQ ID NO: 778)
5'-P03-ACGGCAGTT 5'-P03-CTGCCGTTC

4.42 (SEQ ID NO: 779) (SEQ ID NO: 780)
5'-P03-ACTGACGTT 5'-P03-CGT CAG TTC

4.43 (SEQ ID NO: 781) (SEQ ID NO: 782)
5'-P03-ACGGCACTT 5'-P03-GTG CCG TTC

4.44 (SEQ ID NO: 783) (SEQ ID NO: 784)
5'-P03-ACTGACCTT 5'-P03-GGT CAG TTC

4.45 (SEQ ID NO: 785) (SEQ ID NO: 786)
5'-P03-TTTGCGGTT 5'-P03-CCG CAA ATC

4.46 (SEQ ID NO: 787) (SEQ ID NO: 788)
5'-P03-TGGTAGGTT 5'-P03-CCT ACC ATC

4.47 (SEQ ID NO: 789) (SEQ ID NO: 790)
5'-P03-GTTCGGCTT 5'-P03-GCC GAA CTC

4.48 (SEQ ID NO: 791) (SEQ ID NO: 792)
5'-P03-GCC GTT CTT 5'-P03-GAA CGG CTC

4.49 (SEQ ID NO: 793) (SEQ ID NO: 794)
5'-P03-GGAGAGGTT 5'-P03-CCTCTCCTC

4.50 (SEQ ID NO: 795) (SEQ ID NO: 796)
5'-P03-CACTGACTT 5'-P03-GTC AGT GTC

4.51 (SEQ ID NO: 797) (SEQ ID NO: 798)
5'-P03-CGTGCTCTT 5'-P03-GAG CAC GTC

4.52 (SEQ ID NO: 799) (SEQ ID NO: 800)
5'-P03-AATCCGCTT 5'-P03-GCGGATTTC

4.53 (SEQ ID NO: 801) (SEQ ID NO: 802)
5'-P03-AGGCTGGTT 5'-P03-CCA GCC TTC

4.54 (SEQ ID NO: 803) (SEQ ID NO: 804)
5'-P03-GCTAGTGTT 5'-P03-CAC TAG CTC

4.55 (SEQ ID NO: 805) (SEQ ID NO: 806)
5'-P03-GGAGAGCTT 5'-P03-GCT CTC CTC

4.56 (SEQ ID NO: 807) (SEQ ID NO: 808)
5'-P03-GGAGAGATT 5'-P03-TCT CTC CTC

4.57 (SEQ ID NO: 809) (SEQ ID NO: 810)
5'-P03-AGGCTGCTT 5'-P03-GCA GCC TTC

4.58 (SEQ ID NO: 811) (SEQ ID NO: 812)
5'-P03-GAGTGCGTT 5'-P03-CGC ACT CTC

4.59 (SEQ ID NO: 813) (SEQ ID NO: 814)
5'-P03-CCATCCATT 5'-P03-TGG ATG GTC

4.60 (SEQ ID NO: 815) (SEQ ID NO: 816)
5'-P03-GCTAGTCTT 5'-P03-GAC TAG CTC

4.61 (SEQ ID NO: 817) (SEQ ID NO: 818)
5'-P03-AGGCTGATT 5'-P03-TCA GCC TTC

4.62 (SEQ ID NO: 819) (SEQ ID NO: 820)
5'-P03-ACAGACGTT 5'-P03-CGT CTG TTC

4.63 (SEQ ID NO: 821) (SEQ ID NO: 822)
5'-P03-GAGTGCCTT 5'-P03-GGC ACT CTC

4.64 (SEQ ID NO: 823) (SEQ ID NO: 824)
5'-P03-ACAGACCTT 5'-P03-GGT CTG TTC

4.65 (SEQ ID NO: 825) (SEQ ID NO: 826)
5'-P03-CGAGCTTTT 5'-P03-AAG CTC GTC

4.66 (SEQ ID NO: 827) (SEQ ID NO: 828)
5'-P03-TTAGCGGTT 5'-P03-CCG CTA ATC

4.67 (SEQ ID NO: 829) (SEQ ID NO: 830)
5'-P03-CCTCTTGTT 5'-P03-CAA GAG GTC

4.68 (SEQ ID NO: 831) (SEQ ID NO: 832)
5'-P03-GGTCTCTTT 5'-P03-AGA GAC CTC

4.69 (SEQ ID NO: 833) (SEQ ID NO: 834)
5'-P03-GCCAGATTT 5'-P03-ATC TGG CTC

4.70 (SEQ ID NO: 835) (SEQ ID NO: 836)
5'-P03-GAGACCTTT 5'-P03-AGG TCT CTC

4.71 (SEQ ID NO: 837) (SEQ ID NO: 838)
5'-P03-CACACAGTT 5'-P03-CTG TGT GTC

4.72 (SEQ ID NO: 839) (SEQ ID NO: 840)
5'-P03-CCTCTTCTT 5'-P03-GAA GAG GTC

4.73 (SEQ ID NO: 841) (SEQ ID NO: 842)
5'-P03-TAGAGCGTT 5'-P03-CGC TCT ATC

4.74 (SEQ ID NO: 843) (SEQ ID NO: 844)
5'-P03-GCACCTTTT 5'-P03-AAG GTG CTC

4.75 (SEQ ID NO: 845) (SEQ ID NO: 846)
5'-P03-GGCTTGTTT 5'-P03-ACA AGC CTC

4.76 (SEQ ID NO: 847) (SEQ ID NO: 848)
5'-P03-GACGCGATT 5'-P03-TCG CGT CTC

4.77 (SEQ ID NO: 849) (SEQ ID NO: 850)
5'-P03-CGAGCTGTT 5'-P03-CAG CTC GTC

4.78 (SEQ ID NO: 851) (SEQ ID NO: 852)
5'-P03-TAGAGCCTT 5'-P03-GGC TCT ATC

4.79 (SEQ ID NO: 853) (SEQ ID NO: 854)
5'-P03-CATCCGTTT 5'-P03-ACG GAT GTC

4.80 (SEQ ID NO: 855) (SEQ ID NO: 856)
5'-P03-GGTCTCGTT 5'-P03-CGAGACCTC

4.81 (SEQ ID NO: 857) (SEQ ID NO: 858)
5'-P03-GCCAGAGTT 5'-P03-CTC TGG CTC

4.82 (SEQ ID NO: 859) (SEQ ID NO: 860)
5'-P03-GAGACCGTT 5'-P03-CGG TCT CTC

4.83 (SEQ ID NO: 861) (SEQ ID NO: 862)
5'-P03-CGAGCTATT 5'-P03-TAG CTC GTC

4.84 (SEQ ID NO: 863) (SEQ ID NO: 864)
5'-P03-GCAAGTGTT 5'-P03-CAC TTG CTC

4.85 (SEQ ID NO: 865) (SEQ ID NO: 866)
5'-P03-GGTCTCCTT 5'-P03-GGAGACCTC

4.86 (SEQ ID NO: 867) (SEQ ID NO: 868)
5'-P03-GCCAGACTT 5'-P03-GTC TGG CTC

4.87 (SEQ ID NO: 869) (SEQ ID NO: 870)
5'-P03-GGTCTCATT 5'-P03-TGAGACCTC

4.88 (SEQ ID NO: 871) (SEQ ID NO: 872)
5'-P03-GAGACCATT 5'-P03-TGG TCT CTC

4.89 (SEQ ID NO: 873) (SEQ ID NO: 874)
5'-P03-CCTTCAGTT 5'-P03-CTGAAGGTC

4.90 (SEQ ID NO: 875) (SEQ ID NO: 876)
5'-P03-GCACCTGTT 5'-P03-CAG GTG CTC

4.91 (SEQ ID NO: 877) (SEQ ID NO: 878)
5'-P03-AAAGGCGTT 5'-P03-CGC CTT TTC

4.92 (SEQ ID NO: 879) (SEQ ID NO: 880)
5'-P03-CAGATCGTT 5'-P03-CGATCTGTC

4.93 (SEQ ID NO: 881) (SEQ ID NO: 882)
5'-P03-CATAGGCTT 5'-P03-GCC TAT GTC

4.94 (SEQ ID NO: 883) (SEQ ID NO: 884)
5'-P03-CCTTCACTT 5'-P03-GTG AAG GTC

4.95 (SEQ ID NO: 885) (SEQ ID NO: 886)
5'-P03-GCACCTCTT 5'-P03-GAG GTG CTC

4.96 (SEQ ID NO: 887) (SEQ ID NO: 888)
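
Because one tag is used per cycle, the tags carried by a library member can in principle be read back to recover which tag, and hence which building block, was used in each cycle. The Python sketch below illustrates that lookup under simplifying assumptions that are not taken from the text: the encoding region is treated as the four top strands joined in cycle order (20 bases for cycle 1 and 9 bases for each of cycles 2 to 4), any flanking sequence is ignored, and the lookup tables hold only a few entries copied from the tag tables. All names are hypothetical.

```python
# Assumption: the encoding region is the cycle 1-4 top strands joined end to end.
# Only a few tags from each table are included, for illustration.
TAGS_BY_CYCLE = [
    {"AAATCGATGTGGTCAGGTAG": "1.24", "AAATCGATGTGTGCGAGAAG": "1.27"},
    {"GTTGCCTGT": "2.1", "CAGGACGGT": "2.2"},
    {"CAGCTACGA": "3.1", "CTCCTGCGA": "3.2"},
    {"GCCTGTCTT": "4.1", "CTCCTGGTT": "4.2"},
]
TAG_LENGTHS = [20, 9, 9, 9]


def decode(encoding: str) -> list[str]:
    """Split an encoding region into per-cycle tags and return their tag numbers."""
    if len(encoding) != sum(TAG_LENGTHS):
        raise ValueError("unexpected encoding length")
    tags = []
    pos = 0
    for length, table in zip(TAG_LENGTHS, TAGS_BY_CYCLE):
        chunk = encoding[pos:pos + length]
        if chunk not in table:
            raise ValueError(f"unknown tag sequence: {chunk}")
        tags.append(table[chunk])
        pos += length
    return tags


example = "AAATCGATGTGTGCGAGAAG" + "CAGGACGGT" + "CAGCTACGA" + "CTCCTGGTT"
print(decode(example))  # ['1.27', '2.2', '3.1', '4.2']
```

Each decoded tag number can then be looked up in a building-block correspondence table to name the building block added in that cycle.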



Table 7: Correspondence between building blocks and oligonucleotide tags for 
Cycles 1-4. 



Building block   Cycle 1   Cycle 2   Cycle 3   Cycle 4

BB1 1.1 2.1 3.1 4.1
BB2 1.2 2.2 3.2 4.2
BB3 1.3 2.3 3.3 4.3
BB4 1.4 2.4 3.4 4.4
BB5 1.5 2.5 3.5 4.5
BB6 1.6 2.6 3.6 4.6
BB7 1.7 2.7 3.7 4.7
BB8 1.8 2.8 3.8 4.8
BB9 1.9 2.9 3.9 4.9
BB10 1.10 2.10 3.10 4.10
BB11 1.11 2.11 3.11 4.11
BB12 1.12 2.12 3.12 4.12
BB13 1.13 2.13 3.13 4.13
BB14 1.14 2.14 3.14 4.14
BB15 1.15 2.15 3.15 4.15
BB16 1.16 2.16 3.16 4.16
BB17 1.17 2.17 3.17 4.17
BB18 1.18 2.18 3.18 4.18
BB19 1.19 2.19 3.19 4.19
BB20 1.20 2.20 3.20 4.20
BB21 1.21 2.21 3.21 4.21
BB22 1.22 2.22 3.22 4.22
BB23 1.23 2.23 3.23 4.23
BB24 1.24 2.24 3.24 4.24
BB25 1.25 2.25 3.25 4.25
BB26 1.26 2.26 3.26 4.26
BB27 1.27 2.27 3.27 4.27
BB28 1.28 2.28 3.28 4.28
BB29 1.29 2.29 3.29 4.29
BB30 1.30 2.30 3.30 4.30
BB31 1.31 2.31 3.31 4.31
BB32 1.32 2.32 3.32 4.32
BB33 1.33 2.33 3.33 4.33
BB34 1.34 2.34 3.34 4.34
BB35 1.35 2.35 3.35 4.35
BB36 1.36 2.36 3.36 4.36
BB37 1.37 2.37 3.37 4.37
BB38 1.38 2.38 3.38 4.38
BB39 1.39 2.39 3.39 4.39
BB40 1.44 2.44 3.44 4.44
BB41 1.41 2.41 3.41 4.41
BB42 1.42 2.42 3.42 4.42
BB43 1.43 2.43 3.43 4.43
BB44 1.40 2.40 3.40 4.40
BB45 1.45 2.45 3.45 4.45
BB46 1.46 2.46 3.46 4.46
BB47 1.47 2.47 3.47 4.47
BB48 1.48 2.48 3.48 4.48
BB49 1.49 2.49 3.49 4.49
BB50 1.50 2.50 3.50 4.50
BB51 1.51 2.51 3.51 4.51
BB52 1.52 2.52 3.52 4.52
BB53 1.53 2.53 3.53 4.53
BB54 1.54 2.54 3.54 4.54
BB55 1.55 2.55 3.55 4.55
BB56 1.56 2.56 3.56 4.56
BB57 1.57 2.57 3.57 4.57
BB58 1.58 2.58 3.58 4.58
BB59 1.59 2.59 3.59 4.59


BB60 


1.60 


2.60 


3.60 


4.60 


BB61 


1.61 


2.61 


3.61 


4.61 


BB62 


1.62 


2.62 


3.62 


4.62 


BB63 


1.63 


2,63 


3.63 


4.63 


BB64 


1.64 


2.64 


3.64 


4.64 


BB65 


1.65 


2.65 


3.65 


4.65 


BB66 


1.66 


2.66 


3.66 


4.66 


BB67 


1.67 


2.67 


3.67 


4.67 


BB68 


1.68 


2,68 


3.68 


4.68 


BB69 


1.69 


2.69 


3.69 


4.69 
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BB70 


1.70 


2.70 


3.70 


4.70 


BB71 


1.71 


2.71 


3.71 


4.71 


BB72 


1.72 


2.72 


3.72 


4.72 


BB73 


1.73 


2.73 


3.73 


4.73 


BB74 


1.74 


2.74 


3.74 


4.74 


BB75 


1.75 


2.75 


3.75 


4.75 


BB76 


1.76 


2.76 


3.76 


4.76 


BB77 


1.77 


2.77 


3.77 


4.77 


BB78 


1.78 


2.78 


3.78 


4.78 


BB79 


1.79 


2.79 


3.79 


4.79 


BB80 


1.80 


2.80 


3.80 


4.80 


BB81 


1.81 


2.81 


3.81 


4.81 


BB82 


1.82 


2.82 


3.82 


4.82 


BB83 


1.96 


2.96 


3.96 


4.96 


BB84 


1.83 


2.83 


3.83 


4.83 


BB85 


1.84 


2.84 


3.84 


4.84 


BB86 


1.85 


2.85 


3.85 


4.85 


BB87 


1.86 


2.86 


3.86 


4.86 


BB88 


1.87 


2.87 


3.87 


4.87 


BB89 


1.88 


2.88 


3.88 


4.88 


BB90 


1.89 


2.89 


3.89 


4.89 


BB91 


1.90 


2.90 


3.90 


4.90 


BB92 


1.91 


2.91 


3.91 


4.91 


BB93 


1.92 


2.92 


3.92 


4.92 


BB94 


1.93 


2.93 


3.93 


4.93 


BB95 


1.94 


2.94 


3.94 


4.94 


BB96 


1.95 


2.95 


3.95 


4.95 
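The correspondence in Table 7 is regular except for a few rows (BB40 and BB44 exchange tag numbers, BB83 uses tag 96, and BB84 to BB96 are shifted down by one). A minimal sketch of a lookup that reproduces the table as listed, and inverts it to decode a sequenced set of tags back to building blocks, might look as follows; the function names are hypothetical and the rules are simply read off the rows above.

```python
def tag_index(bb):
    """Tag number (the part after 'cycle.') listed in Table 7 for building block `bb`."""
    if not 1 <= bb <= 96:
        raise ValueError("building block number must be between 1 and 96")
    if bb == 40:
        return 44
    if bb == 44:
        return 40
    if bb == 83:
        return 96
    if bb > 83:
        return bb - 1
    return bb

def tag_name(cycle, bb):
    """Tag label such as '3.17' for a given cycle (1-4) and building block."""
    return f"{cycle}.{tag_index(bb)}"

# Inverse mapping: tag number -> building block number
BB_FOR_TAG = {tag_index(bb): bb for bb in range(1, 97)}

def decode(tags):
    """Translate tag labels, e.g. ['1.44', '2.3'], into building block numbers."""
    return [BB_FOR_TAG[int(label.split(".")[1])] for label in tags]

print(tag_name(1, 40), decode(["1.44", "2.3", "3.96", "4.95"]))
```

Because every tag number from 1 to 96 is used exactly once per cycle, the inverse mapping is unambiguous.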



1X ligase buffer: 50 mM Tris, pH 7.5; 10 mM dithiothreitol; 10 mM MgCl2; 2 mM
ATP; 50 mM NaCl.

10X ligase buffer: 500 mM Tris, pH 7.5; 100 mM dithiothreitol; 100 mM MgCl2; 20
mM ATP; 500 mM NaCl.

Attachment of Water Soluble Spacer to Compound 2 
To a solution of Compound 2 (60 mL, 1 mM) in sodium borate buffer (150
mM, pH 9.4) that was chilled to 4 °C was added 40 equivalents of
N-Fmoc-15-amino-4,7,10,13-tetraoxaoctadecanoic acid (S-Ado) in
N,N-dimethylformamide (DMF) (16 mL, 0.15 M) followed by 40 equivalents of
4-(4,6-dimethoxy[1,3,5]triazin-2-yl)-4-methylmorpholinium chloride hydrate
(DMTMM) in water (9.6 mL, 0.25 M). The mixture was gently shaken for 2 hours
at 4 °C before an additional 40 equivalents each of S-Ado and DMTMM were
added, and shaking was continued for a further 16 hours at 4 °C.

Following acylation, a 0.1X volume of 5 M aqueous NaCl and a 2.5X volume
of cold (-20 °C) ethanol were added and the mixture was allowed to stand at
-20 °C for at least one hour. The mixture was then centrifuged for 15 minutes
at 14,000 rpm in a 4 °C centrifuge to give a white pellet which was washed
with cold EtOH and then dried in a lyophilizer at room temperature for 30
minutes. The solid was dissolved in 40 mL of water and purified by Reverse
Phase HPLC with a Waters Xterra RP18 column. A binary mobile phase gradient
profile was used to elute the product using a 50 mM aqueous triethylammonium
acetate buffer at pH 7.5 and 99% acetonitrile/1% water solution. The purified
material was concentrated by lyophilization and the resulting residue was
dissolved in 5 mL of water. A 0.1X volume of piperidine was added to the
solution and the mixture was gently shaken for 45 minutes at room
temperature. The product was then purified by ethanol precipitation as
described above and isolated by centrifugation. The resulting pellet was
washed twice with cold EtOH and dried by lyophilization to give purified
Compound 3.

Cycle 1 

To each well in a 96 well plate was added 12.5 µL of a 4 mM solution of
Compound 3 in water and 100 µL of a 1 mM solution of one of oligonucleotide
tags 1.1 to 1.96, as shown in Table 3 (the molar ratio of Compound 3 to tags
was 1:2). The plates were heated to 95 °C for 1 minute and then cooled to
16 °C over 10 minutes. To each well was added 10 µL of 10X ligase buffer, 30
units of T4 DNA ligase (1 µL of a 30 unit/µL solution (Fermentas Life Science,
Cat. No. EL0013)) and 76.5 µL of water, and the resulting solutions were
incubated at 16 °C for 16 hours.
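The volumes given for the Cycle 1 ligation can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only; it assumes the 1 µL of ligase contributes to the final volume and uses the identity that µL multiplied by mM gives nmol.

```python
def nmol(volume_ul, conc_mM):
    """Amount in nmol: uL x mmol/L = nmol."""
    return volume_ul * conc_mM

compound3_nmol = nmol(12.5, 4.0)   # Compound 3 per well
tag_nmol = nmol(100.0, 1.0)        # oligonucleotide tag per well

# Compound 3 + tag + 10X buffer + ligase + water, in uL
total_volume_ul = 12.5 + 100.0 + 10.0 + 1.0 + 76.5

tag_to_compound_ratio = tag_nmol / compound3_nmol
final_compound3_mM = compound3_nmol / total_volume_ul
buffer_strength_x = 10.0 * 10.0 / total_volume_ul  # 10 uL of 10X stock

print(compound3_nmol, tag_nmol, total_volume_ul,
      tag_to_compound_ratio, final_compound3_mM, buffer_strength_x)
```

Under these assumptions each well holds 50 nmol of Compound 3 and 100 nmol of tag in 200 µL, which reproduces the stated 1:2 molar ratio and gives a Compound 3 concentration of 0.25 mM in the ligation.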
After the ligation reaction, 20 µL of 5 M aqueous NaCl was added directly to
each well, followed by 500 µL of cold (-20 °C) ethanol, and the plates were
held at -20 °C for 1 hour. The plates were centrifuged for 1 hour at 3200g in
a Beckman Coulter Allegra 6R centrifuge using Beckman Microplus Carriers. The
supernatant was carefully removed by inverting the plate and the pellet was
washed with 70% aqueous cold ethanol at -20 °C. Each of the pellets was then
dissolved in sodium borate buffer (50 µL, 150 mM, pH 9.4) to a concentration
of 1 mM and chilled to 4 °C.

To each solution was added 40 equivalents of one of the 96 building block
precursors in DMF (13 µL, 0.15 M) followed by 40 equivalents of DMTMM in
water (8 µL, 0.25 M), and the solutions were gently shaken at 4 °C. After 2
hours, an additional 40 equivalents each of the same building block precursor
and DMTMM were added and the solutions were gently shaken for 16 hours at
4 °C. Following acylation, 10 equivalents of acetic acid-N-hydroxy-succinimide
ester in DMF (2 µL, 0.25 M) was added to each solution and gently shaken for
10 minutes.
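The reagent volumes and the stated equivalents can be reconciled in the same way. This is a rough check, assuming each well contains 50 µL of a 1 mM solution (50 nmol) of ligated material, as described for the redissolved pellets; the helper name is hypothetical.

```python
def equivalents(reagent_volume_ul, reagent_conc_molar, substrate_nmol):
    """Molar equivalents of a reagent relative to the substrate."""
    # uL x mol/L = umol; multiply by 1000 to express in nmol
    reagent_nmol = reagent_volume_ul * reagent_conc_molar * 1000.0
    return reagent_nmol / substrate_nmol

substrate_nmol = 50.0 * 1.0  # 50 uL at 1 mM

building_block_eq = equivalents(13.0, 0.15, substrate_nmol)  # precursor in DMF
dmtmm_eq = equivalents(8.0, 0.25, substrate_nmol)            # DMTMM in water
capping_eq = equivalents(2.0, 0.25, substrate_nmol)          # NHS acetate in DMF

print(round(building_block_eq, 1), round(dmtmm_eq, 1), round(capping_eq, 1))
```

The 13 µL addition works out to about 39 equivalents, i.e. the nominal 40 equivalents after rounding the volume, while the DMTMM and capping volumes correspond to 40 and 10 equivalents.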

Following acylation, the 96 reaction mixtures were pooled and 0.1 volume of
5 M aqueous NaCl and 2.5 volumes of cold absolute ethanol were added and the
solution was allowed to stand at -20 °C for at least one hour. The mixture was
then centrifuged. Following centrifugation, as much supernatant as possible
was removed with a micropipette, the pellet was washed with cold ethanol and
centrifuged again. The supernatant was removed with a 200 µL pipet. Cold 70%
ethanol was added to the tube, and the resulting mixture was centrifuged for
5 min at 4 °C.

The supernatant was removed and the remaining ethanol was removed by
lyophilization at room temperature for 10 minutes. The pellet was then
dissolved in 2 mL of water and purified by Reverse Phase HPLC with a Waters
Xterra RP18 column. A binary mobile phase gradient profile was used to elute
the library using a 50 mM aqueous triethylammonium acetate buffer at pH 7.5
and 99% acetonitrile/1% water solution. The fractions containing the library
were collected, pooled, and lyophilized. The resulting residue was dissolved
in 2.5 mL of water and 250 µL of piperidine was added. The solution was shaken
gently for 45 minutes and then precipitated with ethanol as previously
described. The resulting pellet was dried by lyophilization and then dissolved
in sodium borate buffer (4.8 mL, 150 mM, pH 9.4) to a concentration of 1 mM.
The solution was chilled to 4 °C and 40 equivalents each of
N-Fmoc-propargylglycine in DMF (1.2 mL, 0.15 M) and DMTMM in water (7.7 mL,
0.25 M) were added. The mixture was gently shaken for 2 hours at 4 °C before
an additional 40 equivalents of N-Fmoc-propargylglycine and DMTMM were added
and the solution was shaken for a further 16 hours. The mixture was later
purified by EtOH precipitation and Reverse Phase HPLC as described above and
the N-Fmoc group was removed by treatment with piperidine as previously
described. Upon final purification by EtOH precipitation, the resulting pellet
was dried by lyophilization and carried into the next cycle of synthesis.

Cycles 2-4 

For each of these cycles, the dried pellet from the previous cycle was
dissolved in water and the concentration of library was determined by
spectrophotometry based on the extinction coefficient of the DNA component of
the library, where the initial extinction coefficient of Compound 2 is 131,500
L/(mol·cm). The concentration of the library was adjusted with water such that
the final concentration in the subsequent ligation reactions was 0.25 mM. The
library was then divided into 96 equal aliquots in a 96 well plate. To each
well was added a solution comprising a different tag (molar ratio of the
library to tag was 1:2), and ligations were performed as described for Cycle
1. Oligonucleotide tags used in Cycles 2, 3 and 4 are set forth in Tables 4, 5
and 6, respectively. Correspondence between the tags and the building block
precursors for each of Cycles 1 to 4 is provided in Table 7. The library was
precipitated by the addition of ethanol as described above for Cycle 1, and
dissolved in sodium borate buffer (150 mM, pH 9.4) to a concentration of 1 mM.
Subsequent acylations and purifications were performed as described for Cycle
1, except HPLC purification was omitted during Cycle 3.
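The spectrophotometric concentration determination follows the Beer-Lambert law, A = ε·c·l. The sketch below is a hedged illustration: the extinction coefficient is the value quoted for Compound 2, while the 1 cm path length, the dilution factor, the example absorbance reading, and the 200 µL ligation volume (taken from the Cycle 1 recipe) are assumptions made for the example.

```python
EPSILON = 131_500.0  # L/(mol*cm), initial extinction coefficient of Compound 2

def concentration_mM(absorbance, dilution_factor=1.0, path_cm=1.0, epsilon=EPSILON):
    """Beer-Lambert: c = A / (epsilon * l), corrected for dilution, in mM."""
    molar = absorbance / (epsilon * path_cm) * dilution_factor
    return molar * 1000.0

def stock_volume_ul(stock_mM, final_mM=0.25, reaction_volume_ul=200.0):
    """Volume of library stock needed per well to reach `final_mM` in the ligation."""
    return final_mM * reaction_volume_ul / stock_mM

# Hypothetical reading: A = 0.526 measured on a 1:1000 dilution of the stock
stock = concentration_mM(0.526, dilution_factor=1000.0)
per_well = stock_volume_ul(stock)
print(round(stock, 3), round(per_well, 2))
```

With these example numbers the stock comes out near 4 mM, so about 12.5 µL per well would give 0.25 mM in a 200 µL ligation, matching the Cycle 1 setup. The extinction coefficient of the library may drift from the initial value as tags are added, so a calculation like this is only an estimate.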

The products of Cycle 4 were ligated with the closing primer shown below,
using the method described above for ligation of tags.

5'-PO3-CAGAAGACAGACAAGCTTCACCTGC (SEQ ID NO:889)
5'-PO3-GCAGGTGAAGCTTGTCTGTCTTCTGAA (SEQ ID NO:890)
Results:

The synthetic procedure described above has the capability of producing a
library comprising 96^4 (about 10^8) different structures. The synthesis of
the library was monitored via gel electrophoresis and LC/MS of the product of
each cycle. Upon completion, the library was analyzed using several
techniques. Figure 13a is a chromatogram of the library following Cycle 4, but
before ligation of the closing primer; Figure 13b is a mass spectrum of the
library at the same synthetic stage. The average molecular weight was
determined by negative ion LC/MS analysis. The ion signal was deconvoluted
using ProMass software. This result is consistent with the predicted average
mass of the library.
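The theoretical library size follows directly from the split-and-pool design: 96 building blocks in each of four cycles. A short illustrative calculation:

```python
def library_size(blocks_per_cycle, cycles):
    """Number of distinct structures from a split-and-pool synthesis."""
    return blocks_per_cycle ** cycles

# Cumulative diversity after each of the four cycles
sizes = [library_size(96, n) for n in range(1, 5)]
print(sizes)
```

The value after Cycle 4 is 84,934,656, i.e. roughly 8.5 x 10^7, in line with the "about 10^8" figure. This counts encoded combinations only and says nothing about how completely each reaction proceeded.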

The DNA component of the library was analyzed by agarose gel
electrophoresis, which showed that the majority of library material
corresponds to ligated product of the correct size. DNA sequence analysis of
molecular clones of PCR product derived from a sampling of the library shows
that DNA ligation occurred with high fidelity and to near completion.

Library cyclization 

At the completion of Cycle 4, a portion of the library was capped at the
N-terminus using azidoacetic acid under the usual acylation conditions. The
product, after purification by EtOH precipitation, was dissolved in sodium
phosphate buffer (150 mM, pH 8) to a concentration of 1 mM and 4 equivalents
each of CuSO4 in water (200 mM), ascorbic acid in water (200 mM), and a
solution of the compound shown below in DMF (200 mM) were added. The reaction
mixture was then gently shaken for 2 hours at room temperature.

[Chemical structure of the ligand]
To assay the extent of cyclization, 5 µL aliquots from the library cyclization
reaction were removed and treated with a fluorescently-labeled azide or alkyne
(1 µL of 100 mM DMF stocks) prepared as described in Example 4. After 16
hours, neither the alkyne nor the azide label had been incorporated into the
library, as judged by HPLC analysis at 500 nm. This result indicated that the
library no longer contained azide or alkyne groups capable of cycloaddition
and that the library must therefore have reacted with itself, either through
cyclization or intermolecular reactions. The cyclized library was purified by
Reverse Phase HPLC as previously described. Control experiments using
uncyclized library showed complete incorporation of the fluorescent tags
mentioned above.



Example 4: Preparation of Fluorescent Tags for Cyclization Assay: 

In separate tubes, propargyl glycine or 2-amino-3-phenylpropylazide (8 µmol
each) was combined with FAM-OSu (Molecular Probes Inc.) (1.2 equiv.) in pH 9.4
borate buffer (250 µL). The reactions were allowed to proceed for 3 h at room
temperature, and were then lyophilized overnight. Purification by HPLC
afforded the desired fluorescent alkyne and azide in quantitative yield.







Example 5: Cyclization of individual compounds using the azide/alkyne
cycloaddition reaction



Preparation of Azidoacetyl-Gly-Pro-Phe-Pra-NH2:

Using 0.3 mmol of Rink-amide resin, the indicated sequence was synthesized
using standard solid phase synthesis techniques with Fmoc-protected amino
acids and HATU as activating agent (Pra = C-propargylglycine). Azidoacetic
acid was used to cap the tetrapeptide. The peptide was cleaved from the resin
with 20% TFA/DCM for 4 h. Purification by RP HPLC afforded product as a white
solid (75 mg, 51%). 1H NMR (DMSO-d6, 400 MHz): 8.4 - 7.8 (m, 3H), 7.4 - 7.1
(m, 7H), 4.6 - 4.4 (m, 1H), 4.4 - 4.2 (m, 2H), 4.0 - 3.9 (m, 2H), 3.74 (dd,
1H, J = 6 Hz, 17 Hz), 3.5 - 3.3 (m, 2H), 3.07 (dt, 1H, J = 5 Hz, 14 Hz), 2.92
(dd, 1H, J = 5 Hz, 16 Hz), 2.86 (t, 1H, J = 2 Hz), 2.85 - 2.75 (m, 1H), 2.6 -
2.4 (m, 2H), 2.2 - 1.6 (m, 4H). IR (mull) 2900, 2100, 1450, 1300 cm-1. ESIMS
497.4 ([M+H], 100%), 993.4 ([2M+H], 50%). ESIMS with ion-source fragmentation:
519.3 ([M+Na], 100%), 491.3 (100%), 480.1 ([M-NH2], 90%), 452.2 ([M-NH2-CO],
20%), 424.2 (20%), 385.1 ([M-Pra], 50%), 357.1 ([M-Pra-CO], 40%), 238.0
([M-Pra-Phe], 100%).



Cyclization of Azidoacetyl-Gly-Pro-Phe-Pra-NH2:

[Reaction scheme: cyclization of the azidoacetyl peptide with Cu(MeCN)4PF6]




The azidoacetyl peptide (31 mg, 0.62 mmol) was dissolved in MeCN (30 mL).
Diisopropylethylamine (DIEA, 1 mL) and Cu(MeCN)4PF6 (1 mg) were added. After
stirring for 1.5 h, the solution was evaporated and the resulting residue was
taken up in 20% MeCN/H2O. After centrifugation to remove insoluble salts, the
solution was subjected to preparative reverse phase HPLC. The desired cyclic
peptide was isolated as a white solid (10 mg, 32%). 1H NMR (DMSO-d6, 400 MHz):
8.28 (t, 1H, J = 5 Hz), 7.77 (s, 1H), 7.2 - 6.9 (m, 9H), 4.98 (m, 2H), 4.48
(m, 1H), 4.28 (m, 1H), 4.1 - 3.9 (m, 2H), 3.63 (dd, 1H, J = 5 Hz, 16 Hz), 3.33
(m, 2H), 3.0 (m, 3H), 2.48 (dd, 1H, J = 11 Hz, 14 Hz), 1.75 (m, 1H), 1.55 (m,
1H), 1.32 (m, 1H), 1.05 (m, 1H). IR (mull) 2900, 1475, 1400 cm-1. ESIMS 497.2
([M+H], 100%), 993.2 ([2M+H], 30%), 1015.2 ([2M+Na], 15%). ESIMS with
ion-source fragmentation: 535.2 (70%), 519.3 ([M+Na], 100%), 497.2 ([M+H],
80%), 480.1 ([M-NH2], 30%), 452.2 ([M-NH2-CO], 40%), 208.1 (60%).

Preparation of Azidoacetyl-Gly-Pro-Phe-Pra-Gly-OH: 

Using 0.3 mmol of Glycine-Wang resin, the indicated sequence was synthesized
using Fmoc-protected amino acids and HATU as the activating agent. Azidoacetic
acid was used in the last coupling step to cap the pentapeptide. Cleavage of
the peptide was achieved using 50% TFA/DCM for 2 h. Purification by RP HPLC
afforded the peptide as a white solid (83 mg, 50%). 1H NMR (DMSO-d6, 400 MHz):
8.4 - 7.9 (m, 4H), 7.2 (m, 5H), 4.7 - 4.2 (m, 3H), 4.0 - 3.7 (m, 4H), 3.5 -
3.3 (m, 2H), 3.1 (m, 1H), 2.91 (dd, 1H, J = 4 Hz, 16 Hz), 2.84 (t, 1H, J = 2.5
Hz), 2.78 (m, 1H), 2.6 - 2.4 (m, 2H), 2.2 - 1.6 (m, 4H). IR (mull) 2900, 2100,
1450, 1350 cm-1. ESIMS 555.3 ([M+H], 100%). ESIMS with ion-source
fragmentation: 577.1 ([M+Na], 90%), 555.3 ([M+H], 80%), 480.1 ([M-Gly], 100%),
385.1 ([M-Gly-Pra], 70%), 357.1 ([M-Gly-Pra-CO], 40%), 238.0 ([M-Gly-Pra-Phe],
80%).

Cyclization of Azidoacetyl-Gly-Pro-Phe-Pra-Gly-OH:

The peptide (32 mg, 0.058 mmol) was dissolved in MeCN (60 mL).
Diisopropylethylamine (1 mL) and Cu(MeCN)4PF6 (1 mg) were added and the
solution was stirred for 2 h. The solvent was evaporated and the crude product
was subjected to RP HPLC to remove dimers and trimers. The cyclic monomer was
isolated as a colorless glass (6 mg, 20%). ESIMS 555.6 ([M+H], 100%), 1109.3
([2M+H], 20%), 1131.2 ([2M+Na], 15%).

ESIMS with ion source fragmentation: 555.3 ([M+H], 100%), 480.4 ([M-Gly],
30%), 452.2 ([M-Gly-CO], 25%), 424.5 ([M-Gly-2CO], 10%, only possible in a
cyclic structure).



Conjugation of Linear Peptide to DNA: 

Compound 2 (45 nmol) was dissolved in 45 µL sodium borate buffer (pH 9.4;
150 mM). At 4 °C, linear peptide (18 µL of a 100 mM stock in DMF; 180 nmol; 40
equiv.) was added, followed by DMTMM (3.6 µL of a 500 mM stock in water; 180
nmol; 40 equiv.). After agitating for 2 h, LCMS showed complete reaction, and
product was isolated by ethanol precipitation. ESIMS 1823.0 ([M-3H]/3, 20%),
1367.2 ([M-4H]/4, 20%), 1093.7 ([M-5H]/5, 40%), 911.4 ([M-6H]/6, 100%).
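Each peak in the charge-state series reports the same neutral mass M, since for a negative ion of the form [M-zH]/z the observed value is m/z = (M - z·m_H)/z. The sketch below back-calculates M from the four listed peaks. It assumes negative-ion mode and a proton mass of 1.00728 Da; it is a simple illustration, not the deconvolution algorithm used by any particular software package.

```python
PROTON_MASS = 1.00728  # Da

def neutral_mass(mz, charge):
    """Neutral mass M from a negative-ion peak [M - zH]/z: M = z * (m/z + m_H)."""
    return charge * (mz + PROTON_MASS)

# (observed m/z, charge state) as listed for the peptide-DNA conjugate
peaks = [(1823.0, 3), (1367.2, 4), (1093.7, 5), (911.4, 6)]

masses = [neutral_mass(mz, z) for mz, z in peaks]
mean_mass = sum(masses) / len(masses)

for (mz, z), m in zip(peaks, masses):
    print(f"z = {z}: m/z {mz} -> M = {m:.1f} Da")
print(f"mean M = {mean_mass:.1f} Da")
```

The four estimates agree to within about 2.5 Da (roughly 5472 to 5474 Da), which is the kind of consistency expected when all peaks belong to one species.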


Conjugation of Cyclic Peptide to DNA: 

Compound 2 (20 nmol) was dissolved in 20 µL sodium borate buffer (pH 9.4,
150 mM). At 4 °C, linear peptide (8 µL of a 100 mM stock in DMF; 80 nmol; 40
equiv.) was added, followed by DMTMM (1.6 µL of a 500 mM stock in water; 80
nmol; 40 equiv.). After agitating for 2 h, LCMS showed complete reaction, and
product was isolated by ethanol precipitation. ESIMS 1823.0 ([M-3H]/3, 20%),
1367.2 ([M-4H]/4, 20%), 1093.7 ([M-5H]/5, 40%), 911.4 ([M-6H]/6, 100%).

Cyclization of DNA-Linked Peptide:

Linear peptide-DNA conjugate (10 nmol) was dissolved in pH 8 sodium
phosphate buffer (10 µL, 150 mM). At room temperature, 4 equivalents each of
CuSO4, ascorbic acid, and the Sharpless ligand were all added (0.2 µL of 200
mM stocks). The reaction was allowed to proceed overnight. RP HPLC showed that
no linear peptide-DNA was present, and that the product co-eluted with
authentic cyclic peptide-DNA. No traces of dimers or other oligomers were
observed.
[Reaction scheme: cyclization of the peptide (MeCN, 1.6 h, rt) and conjugation
of the linear and cyclic peptides to the DNA-linker (1 mM DNA-Linker, DMTMM,
pH 9.5); structure of the ligand; retention times of 4.48 min and 4.27 min are
indicated for the conjugates.]

LC conditions: Targa C18, 2.1 x 40 mm, 10-40% MeCN in 40 mM aq. TEAA over 8
min.



Example 6: Application of Aromatic Nucleophilic Substitution Reactions to
Functional Moiety Synthesis

General Procedure for Arylation of Compound 3 with Cyanuric Chloride: 

Compound 2 is dissolved in pH 9.4 sodium borate buffer at a concentration of
1 mM. The solution is cooled to 4 °C and 20 equivalents of cyanuric chloride
is then added as a 500 mM solution in MeCN. After 2 h, complete reaction is
confirmed by LCMS and the resulting dichlorotriazine-DNA conjugate is isolated
by ethanol precipitation.

Procedure for Amine Substitution of Dichlorotriazine-DNA:

The dichlorotriazine-DNA conjugate is dissolved in pH 9.5 borate buffer at a
concentration of 1 mM. At room temperature, 40 equivalents of an aliphatic
amine is added as a DMF solution. The reaction is followed by LCMS and is
usually complete after 2 h. The resulting alkylamino-monochlorotriazine-DNA
conjugate is isolated by ethanol precipitation.
Procedure for Amine Substitution of Monochlorotriazine-DNA:

The alkylamino-monochlorotriazine-DNA conjugate is dissolved in pH 9.5
borate buffer at a concentration of 1 mM. At 42 °C, 40 equivalents of a second
aliphatic amine is added as a DMF solution. The reaction is followed by LCMS
and is usually complete after 2 h. The resulting diaminotriazine-DNA conjugate
is isolated by ethanol precipitation.

Example 7: Application of Reductive Amination Reactions to Functional Moiety 
Synthesis 

General Procedure for Reductive Amination of DNA-Linker Containing a Secondary
Amine with an Aldehyde Building Block:

Compound 2 was coupled to an N-terminal proline residue. The resulting
compound was dissolved in sodium phosphate buffer (50 µL, 150 mM, pH 5.5) at a
concentration of 1 mM. To this solution was added 40 equivalents each of an
aldehyde building block in DMF (8 µL, 0.25 M) and sodium cyanoborohydride in
DMF (8 µL, 0.25 M) and the solution was heated at 80 °C for 2 hours. Following
alkylation, the solution was purified by ethanol precipitation.

General Procedure for Reductive Aminations of DNA-Linker Containing an
Aldehyde with Amine Building Blocks:

Compound 2 coupled to a building block comprising an aldehyde group was
dissolved in sodium phosphate buffer (50 µL, 250 mM, pH 5.5) at a
concentration of 1 mM. To this solution was added 40 equivalents each of an
amine building block in DMF (8 µL, 0.25 M) and sodium cyanoborohydride in DMF
(8 µL, 0.25 M) and the solution was heated at 80 °C for 2 hours. Following
alkylation, the solution was purified by ethanol precipitation.
Example 8: Application of Peptoid Building Reactions to Functional Moiety
Synthesis

General Procedure for Peptoid Synthesis on DNA-Linker:

[Reaction scheme: two-step peptoid synthesis starting from H2N-DNA-Linker,
with 40 equivalents of reagent in each step]

Compound 2 was dissolved in sodium borate buffer (50 µL, 150 mM, pH 9.4)
at a concentration of 1 mM and chilled to 4 °C. To this solution was added 40
equivalents of N-hydroxysuccinimidyl bromoacetate in DMF (13 µL, 0.15 M) and
the solution was gently shaken at 4 °C for 2 hours. Following acylation, the
DNA-Linker was purified by ethanol precipitation and redissolved in sodium
borate buffer (50 µL, 150 mM, pH 9.4) at a concentration of 1 mM and chilled
to 4 °C. To this solution was added 40 equivalents of an amine building block
in DMF (13 µL, 0.15 M) and the solution was gently shaken at 4 °C for 16
hours. Following alkylation, the DNA-Linker was purified by ethanol
precipitation and redissolved in sodium borate buffer (50 µL, 150 mM, pH 9.4)
at a concentration of 1 mM and chilled to 4 °C. Peptoid synthesis is continued
by the stepwise addition of N-hydroxysuccinimidyl bromoacetate followed by the
addition of an amine building block.



Example 9: Application of the Azide-Alkyne Cycloaddition Reaction to Functional
Moiety Synthesis

General procedure

An alkyne-containing DNA conjugate is dissolved in pH 8.0 phosphate buffer
at a concentration of ca. 1 mM. To this mixture is added 10 equivalents of an
organic azide and 5 equivalents each of copper (II) sulfate, ascorbic acid,
and the ligand (tris-((1-benzyltriazol-4-yl)methyl)amine), all at room
temperature. The reaction is followed by LCMS, and is usually complete after
1-2 h. The resulting triazole-DNA conjugate can be isolated by ethanol
precipitation.
Example 10: Identification of a ligand to Abl kinase from within an encoded
library

The ability to enrich molecules of interest in a DNA-encoded library above
undesirable library members is paramount to identifying single compounds with
defined properties against therapeutic targets of interest. To demonstrate
this enrichment ability, a known binding molecule (described by Shah et al.,
Science 305, 399-401 (2004), incorporated herein by reference) to rhAbl kinase
(GenBank U07563) was synthesized. This compound was attached to a double
stranded DNA oligonucleotide via the linker described in the preceding
examples using standard chemistry methods to produce a molecule similar
(functional moiety linked to an oligonucleotide) to those produced via the
methods described in Examples 1 and 2. A library generally produced as
described in Example 2 and the DNA-linked Abl kinase binder were designed with
unique DNA sequences that allowed qPCR analysis of both species. The
DNA-linked Abl kinase binder was mixed with the library at a ratio of 1:1000.
This mixture was equilibrated with rhAbl kinase, and the enzyme was captured
on a solid phase and washed to remove non-binding library members, and binding
molecules were eluted. The ratio of library molecules to the DNA-linked Abl
kinase inhibitor in the eluate was 1:1, indicating a greater than 500-fold
enrichment of the DNA-linked Abl-kinase binder in a 1000-fold excess of
library molecules.
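The enrichment figure can be understood as a ratio of ratios: the binder-to-library ratio after selection divided by the same ratio before selection. The sketch below applies that definition to the nominal numbers in this example; the definition and the function name are assumptions made for illustration, and qPCR measurement uncertainty is ignored.

```python
def fold_enrichment(binder_in, library_in, binder_out, library_out):
    """Ratio of binder:library after selection to binder:library before selection."""
    ratio_in = binder_in / library_in
    ratio_out = binder_out / library_out
    return ratio_out / ratio_in

# Input mixture 1:1000 (binder:library); eluate 1:1
enrichment = fold_enrichment(1.0, 1000.0, 1.0, 1.0)
print(round(enrichment))
```

With the nominal input and output ratios this gives about 1000-fold, which is consistent with the more conservative "greater than 500-fold" stated above.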

Equivalents 

Those skilled in the art will recognize, or be able to ascertain using no more
than routine experimentation, many equivalents to the specific embodiments of
the invention described herein. Such equivalents are intended to be
encompassed by the following claims.
Claims

1. A method for identifying one or more compounds which bind to a biological
target, said method comprising:

(A) synthesizing a library of compounds, wherein the compounds comprise a
functional moiety comprising two or more building blocks which is operatively
linked to an initial oligonucleotide which identifies the structure of the
functional moiety by:

(i) providing a solution comprising m initiator compounds, wherein m is an
integer of 1 or greater, where the initiator compounds consist of a functional
moiety comprising n building blocks, where n is an integer of 1 or greater,
which is operatively linked to an initial oligonucleotide which identifies the
n building blocks;

(ii) dividing the solution of step (i) into r reaction vessels, wherein r is
an integer of 2 or greater, thereby producing r aliquots of the solution;

(iii) reacting the initiator compounds in each reaction vessel with one of r
building blocks, thereby producing r aliquots comprising compounds consisting
of a functional moiety comprising n+1 building blocks operatively linked to
the initial oligonucleotide; and

(iv) reacting the initial oligonucleotide in each aliquot with one of a set
of r distinct incoming oligonucleotides in the presence of an enzyme which
catalyzes the ligation of the incoming oligonucleotide and the initial
oligonucleotide, under conditions suitable for enzymatic ligation of the
incoming oligonucleotide and the initial oligonucleotide; thereby producing r
aliquots of molecules consisting of a functional moiety comprising n+1
building blocks operatively linked to an elongated oligonucleotide which
encodes the n+1 building blocks;

(B) contacting the biological target with the library of compounds, or a
portion thereof, under conditions suitable for at least one member of the
library of compounds to bind to the target;

(C) removing library members that do not bind to the target;

(D) sequencing the encoding oligonucleotides of the at least one member of
the library of compounds which binds to the target, and

(E) using the sequences determined in step (D) to determine the structure of
the functional moieties of the members of the library of compounds which bind
to the biological target, thereby identifying one or more compounds which bind
to the biological target.
2. The method of claim 1, further comprising amplifying the encoding
oligonucleotides of the at least one member of the library of compounds which
binds to the target.

3. The method of claim 2, wherein said amplifying step comprises:

(i) forming a water-in-oil emulsion to create a plurality of aqueous
microreactors, wherein at least one of the microreactors comprises the at
least one member of the library of compounds that binds to the target, a
single bead capable of binding to the encoding oligonucleotide of the at least
one member of the library of compounds that binds to the target, and
amplification reaction solution containing reagents necessary to perform
nucleic acid amplification;

(ii) amplifying the encoding oligonucleotide in the microreactors to form
amplified copies of said encoding oligonucleotide; and

(iii) binding the amplified copies of the encoding oligonucleotide to the
beads in the microreactors.

4. The method of claim 1, wherein said sequencing step (D) comprises:

(i) annealing an effective amount of a sequencing primer to the amplified
copies of the encoding oligonucleotide and extending the sequencing primer
with a polymerase and a predetermined nucleotide triphosphate to yield a
sequencing product and, if the predetermined nucleotide triphosphate is
incorporated onto a 3' end of said sequencing primer, a sequencing reaction
byproduct; and

(ii) identifying the sequencing reaction byproduct, thereby determining the
sequence of the encoding oligonucleotide.

5. A method for identifying one or more compounds which bind to a biological 
target, said method comprising: 

(A) synthesizing a library of compounds, wherein the compounds comprise a 
functional moiety comprising two or more building blocks which is operatively linked 
to an initial oHgonucleotide which identifies the stiiictiwe of the fimctional moiety by: 
(i) providing a solution comprising m initiator compounds, wherein m 
is an integer of 1 or greater, where the initiator compounds consist of a fimctional 
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moiety comprising n building blocks, where n is an integer of 1 or greater, which is 
operatively linked to an initial oligonucleotide which identifies the n building blocks; 

(ii) dividing the solution of step (i) into r reaction vessels, wherein r is 
an integer of 2 or greater, thereby producing r aliquots of the solution; 
5 (iii) reacting the initiator conipounds in each reaction vessel with one 

of r building blocks, thereby producing r ahquots comprising compounds consisting 
of a functional moiety comprising n+1 building blocks operatively linked to the initial 
oligonucleotide; and 

(iv) reacting the initial oligonucleotide in each aliquot with one of a set 
10 of r distinct incoming oligonucleotides in the presence of an enzyme which catalyzes 
the ligation of the incoming oligonucleotide and the initial oligonucleotide, under 
conditions suitable for enzymatic ligation of the incoming oligonucleotide and the 
initial oligonucleotide; thereby producing r aliquots of molecules consisting of a 
functional moiety comprising n+1 building blocks operatively linked to an elongated 
15 oligonucleotide which encodes the n+1 building blocks; 

(B) contacting the biological target with the library of compounds, or a
portion thereof, under conditions suitable for at least one member of the library of
compounds to bind to the target;

(C) removing library members that do not bind to the target;

(D) sequencing the encoding oligonucleotides of the at least one member of
the library of compounds which binds to the target, wherein said sequencing
comprises:

(i) annealing an effective amount of a sequencing primer to the
amplified copies of the encoding oligonucleotide and extending the sequencing primer
with a polymerase and a predetermined nucleotide triphosphate to yield a sequencing
product and, if the predetermined nucleotide triphosphate is incorporated onto a 3' end
of said sequencing primer, a sequencing reaction byproduct; and

(ii) identifying the sequencing reaction byproduct, thereby determining 
the sequence of the encoding oligonucleotide; and 

(E) using the sequence of the encoding oligonucleotide determined in step (D)
to determine the structure of the functional moieties of the members of the library of
compounds which bind to the biological target, thereby identifying one or more
compounds which bind to the biological target.

6. The method of claim 5, further comprising amplifying the encoding
oligonucleotides of the at least one member of the library of compounds which binds 
to the target. 
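
The split, react and ligate cycle recited in steps (A)(i) to (A)(iv) of claim 5, and the decoding of step (E), can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the building-block names, the 9-base tags, the initial oligonucleotide `GATTACA`, and the helper functions are hypothetical and are not taken from the sequence listing or from any particular implementation. Ligation is modelled simply as string concatenation of a single strand.

```python
from dataclasses import dataclass

# Hypothetical building blocks and 9-base tags (illustrative only).
CODEBOOK = {"Gly": "AAACCCGGG", "Ala": "TTTGGGCCC", "Phe": "ACGTACGTA"}
TAG_LEN = 9
INITIAL_OLIGO = "GATTACA"  # hypothetical initial oligonucleotide


@dataclass(frozen=True)
class Conjugate:
    building_blocks: tuple  # functional moiety, in order of addition
    oligo: str              # encoding oligonucleotide, one strand, 5'->3'


def split_react_ligate(pool, codebook):
    """One aliquot per building block; each aliquot gets its own tag."""
    aliquots = []
    for block, tag in codebook.items():
        aliquots.append([
            Conjugate(c.building_blocks + (block,), c.oligo + tag)
            for c in pool
        ])
    return aliquots


def combine(aliquots):
    """Pool the aliquots so the next cycle starts from one solution."""
    return [c for aliquot in aliquots for c in aliquot]


def decode(oligo, codebook, initial_oligo=INITIAL_OLIGO, tag_len=TAG_LEN):
    """Read the added building blocks back from the tag sequence."""
    reverse = {tag: block for block, tag in codebook.items()}
    tags = oligo[len(initial_oligo):]
    return tuple(reverse[tags[i:i + tag_len]]
                 for i in range(0, len(tags), tag_len))


# Start from a single initiator compound (m = 1, n = 1) and run two cycles.
pool = [Conjugate(building_blocks=("core",), oligo=INITIAL_OLIGO)]
for _ in range(2):
    pool = combine(split_react_ligate(pool, CODEBOOK))
```

With three building blocks per cycle, two cycles give nine distinct conjugates, and `decode(c.oligo, CODEBOOK)` returns the two blocks added to each conjugate `c`. Fixed-length tags are a simplification chosen so that the oligonucleotide can be split without delimiters.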

7. The method of claim 6, wherein said amplification of the encoding
oligonucleotides is carried out by a method selected from the group consisting of: the
polymerase chain reaction (PCR); transcription-based amplification, rapid
amplification of cDNA ends, continuous flow amplification, and rolling circle
amplification.

8. The method of any one of claims 1, 4, or 5, wherein said sequencing of the 
encoding oligonucleotides is carried out by a pyrophosphate-based sequencing 
reaction or a single molecule sequencing by synthesis method. 

9. The method of claim 8, wherein the sequencing reaction byproduct is PPi and
a coupled sulfurylase/luciferase reaction is used to generate light for detection.
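
The pyrophosphate-based readout of claims 8 and 9 can be sketched as an idealized simulation: nucleotides are dispensed one at a time, and each incorporation releases PPi, which is modelled here as a light signal proportional to the number of bases incorporated. This is a noise-free toy model under stated assumptions; the function names, the fixed dispensation order `ACGT`, and the example template are hypothetical and do not describe any specific instrument.

```python
from itertools import cycle

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def pyrogram(template, dispensation_order="ACGT", max_flows=400):
    """Return (nucleotide, signal) pairs for each dispensation.

    `template` lists the template bases in the order the polymerase
    encounters them downstream of the annealed sequencing primer.
    """
    flows = []
    pos = 0
    for nucleotide in cycle(dispensation_order):
        if pos >= len(template) or len(flows) >= max_flows:
            break
        run = 0
        # Every complementary template base in a row is extended in one flow,
        # so a homopolymer run releases proportionally more PPi.
        while pos < len(template) and COMPLEMENT[template[pos]] == nucleotide:
            run += 1
            pos += 1
        flows.append((nucleotide, run))
    return flows


def call_bases(flows):
    """Rebuild the synthesized strand from the per-flow signals."""
    return "".join(nucleotide * signal for nucleotide, signal in flows)
```

For the template `"TTGCA"`, the first flow (A) gives a signal of 2 and the next three flows (C, G, T) each give a signal of 1, so `call_bases` returns `"AACGT"`, the complement of the template.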

10. The method of any one of claims 1 or 5, further comprising the step of
enriching for beads which bind amplified copies of the encoding oligonucleotide
away from beads to which no encoding oligonucleotide is bound.

11. The method of claim 10, wherein the method for said enrichment step is
selected from the group consisting of affinity purification, and electrophoresis.

12. The method of claim 3, further comprising breaking the emulsion to retrieve
one or more of the amplified copies of the encoding oligonucleotide.

13. The method of claim 1 or 5, further comprising the step of

(A)(v) combining two or more of the r aliquots, thereby producing a solution
comprising molecules consisting of a functional moiety comprising n+1 building
blocks, which is operatively linked to an elongated oligonucleotide which encodes the
n+1 building blocks.

14. The method of claim 13, wherein r aliquots are combined.




15. The method of claim 13, wherein the steps (A)(i) to (A)(v) are conducted one
or more times to yield cycles 1 to i, where i is an integer of 2 or greater, wherein in
cycle s+1, where s is an integer of i-1 or less, the solution comprising m initiator
compounds of step (a) is the solution of step (e) of cycle s.
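
The effect of repeating the cycle as in claim 15 can be shown with a short calculation. The sketch assumes that all aliquots are pooled after every cycle and that every building block reacts with every compound to give a distinct product, so each cycle multiplies the number of distinct compounds by its value of r. The function name and the example figures are illustrative only.

```python
from math import prod


def library_size(m, r_per_cycle):
    """Upper bound on distinct compounds after the given cycles.

    m            number of initiator compounds (integer, 1 or greater)
    r_per_cycle  number of building blocks used in each cycle
                 (each an integer of 2 or greater)
    """
    if m < 1:
        raise ValueError("m must be 1 or greater")
    if any(r < 2 for r in r_per_cycle):
        raise ValueError("each r must be 2 or greater")
    return m * prod(r_per_cycle)
```

For example, one initiator compound carried through three cycles of 96 building blocks each gives `library_size(1, [96, 96, 96])`, that is 96 x 96 x 96 = 884736 possible compounds.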

16. The method of claim 1 or 5, wherein at least one of the building blocks is an
amino acid.



17. The method of claim 1 or 5, wherein the initial oligonucleotide is a
covalently coupled double-stranded oligonucleotide.

18. The method of claim 17, wherein the incoming oligonucleotide is a double- 
stranded oligonucleotide. 

19. The method of claim 1 or 5, wherein the initiator compounds comprise a
linker moiety comprising a first functional group adapted to bond with a building
block, a second functional group adapted to bond to the 5'-end of an oligonucleotide,
and a third functional group adapted to bond to the 3'-end of an oligonucleotide.

20. The method of claim 19, wherein the linker moiety is of the structure 




[structural formula: a scaffold S carrying three arms, A-D-, B-E- and C-F-]

wherein 

A is a functional group adapted to bond to a building block;

B is a functional group adapted to bond to the 5'-end of an oligonucleotide;

C is a functional group adapted to bond to the 3'-end of an oligonucleotide;

S is an atom or a scaffold;

D is a chemical structure that connects A to S;

E is a chemical structure that connects B to S; and

F is a chemical structure that connects C to S.

21. The method of claim 20, wherein:
A is an amino group;

B is a phosphate group; and
C is a phosphate group.

22. The method of claim 20, wherein D, E and F are each, independently, an
alkylene group or an oligo(ethylene glycol) group.

23. The method of claim 20, wherein S is a carbon atom, a nitrogen atom, a
phosphorus atom, a boron atom, a phosphate group, a cyclic group or a polycyclic
group.

24. The method of claim 23, wherein the linker moiety is of the structure 



-OP(O)2O-(CH2CH2O)m-OPO3-

-N (CH2)n-

-OP(O)2O-(CH2CH2O)p-OPO3-




wherein each of n, m and p is, independently, an integer from 1 to about 20. 

25. The method of claim 24, wherein each of n, m and p is independently an
integer from 2 to eight.

26. The method of claim 25, wherein each of n, m and p is independently an
integer from 3 to 6.

27. The method of claim 24, wherein the linker moiety has the structure



[structural formula not legible in this copy; only the fragment "HN" is recoverable]




28. The method of claim 1 or 5, wherein each of said initiator compounds
comprises a reactive group and wherein each of said r building blocks comprises a
complementary reactive group which is complementary to said reactive group.

29. The method of claim 28, wherein the reactive group and the complementary
reactive group are selected from the group consisting of an amino group; a carboxyl
group; a sulfonyl group; a phosphonyl group; an epoxide group; an aziridine group;
and an isocyanate group.

30. The method of claim 28, wherein the reactive group and the complementary
reactive group are selected from the group consisting of a hydroxyl group; a carboxyl
group; a sulfonyl group; a phosphonyl group; an epoxide group; an aziridine group;
and an isocyanate group.

31. The method of claim 28, wherein the reactive group and the complementary
reactive group are selected from the group consisting of an amino group and an
aldehyde or ketone group.

32. The method of claim 28, wherein the reaction between the reactive group and 
the complementary reactive group is conducted under reducing conditions. 

33. The method of claim 28, wherein the reactive group and the complementary
reactive group are selected from the group consisting of a phosphorous ylide group
and an aldehyde or ketone group.

34. The method of claim 28, wherein the reactive group and the complementary
reactive group react via cycloaddition to form a cyclic structure.

35. The method of claim 34, wherein the reactive group and the complementary
reactive group are selected from the group consisting of an alkyne and an azide.

36. The method of claim 28, wherein the reactive group and the complementary 
functional group are selected from the group consisting of a halogenated 
heteroaromatic group and a nucleophile. 

37. The method of claim 36, wherein the halogenated heteroaromatic group is
selected from the group consisting of chlorinated pyrimidines, chlorinated triazines
and chlorinated purines.

38. The method of claim 36, wherein the nucleophile is an amino group. 

39. The method of claim 13, further comprising following cycle i, the step of:
(A)(vi) cyclizing one or more of the functional moieties.

40. The method of claim 39, wherein a functional moiety of step (A)(vi)
comprises an azido group and an alkynyl group.

41. The method of claim 40, wherein the functional moiety is maintained under
conditions suitable for cycloaddition of the azido group and the alkynyl group to form
a triazole group, thereby forming a cyclic functional moiety.

42. The method of claim 41, wherein the cycloaddition reaction is conducted in
the presence of a copper catalyst.

43. The method of claim 42, wherein at least one of the one or more functional
moieties of step (f) comprises at least two sulfhydryl groups, and said functional
moiety is maintained under conditions suitable for reaction of the two sulfhydryl
groups to form a disulfide group, thereby cyclizing the functional moiety.

44. The method of claim 1 or 5, wherein the initial oligonucleotide comprises a
PCR primer sequence.

45. The method of claim 13, wherein the incoming oligonucleotide of cycle i
comprises a PCR closing primer.

46. The method of claim 13, further comprising following cycle i, the step of
(d) ligating an oligonucleotide comprising a closing PCR primer sequence to
the encoding oligonucleotide.

47. The method of claim 46, wherein the oligonucleotide comprising a closing
PCR primer sequence is ligated to the encoding oligonucleotide in the presence of an
enzyme which catalyzes said ligation.

SEQUENCE LISTING



<110> Praecis Pharmaceuticals, Inc. 

<120> METHODS FOR IDENTIFYING COMPOUNDS OF INTEREST 
USING ENCODED LIBRARIES 

<130> PPI-168 

<150> 60/731464 
<151> 2005-10-28 

<160> 890 

<170> FastSEQ for Windows Version 4.0 

<210> 1 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 1 

gcaacgaag 9

<210> 2
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 2
tcgttgcca 9

<210> 3
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 3
gcgtacaag 9

<210> 4
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 4
tgtacgcca 9



<210> 5
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 5
gctctgtag 9

<210> 6
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 6
acagagcca 9

<210> 7
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 7
gtgccatag 9

<210> 8
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 8
atggcacca 9

<210> 9
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 9
gttgaccag 9

<210> 10
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 10
ggtcaacca 9

<210> 11
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 11
cgacttgac 9



<210> 12 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 12 

acgctgaac 9 

<210> 13
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 13
cgtagtcag 9

<210> 14
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 14
gactacgca 9

<210> 15
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 15
ccagcatag 9

<210> 16
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 16
atgctggca 9

<210> 17
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 17
cctacagag 9



<210> 18 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 18 

ctgtaggca 9 

<210> 19
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 19
ctgaacgag 9

<210> 20
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 20
acgacttgc 9

<210> 21
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 21
ctccagtag 9

<210> 22
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 22
actggagca 9

<210> 23
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 23
taggtccag 9



<210> 24 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 24 

ggacctaca 9 

<210> 25
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 25
gcgtgttgt 9

<210> 26
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 26
aacacgcct 9

<210> 27
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 27
gcttggagt 9

<210> 28
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 28
tccaagcct 9

<210> 29
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 29
gtcaagcgt 9



<210> 30 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 30 

gcttgacct 9 

<210> 31
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 31
caagagcgt 9

<210> 32
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 32
gctcttgct 9

<210> 33
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 33
cagttcggt 9

<210> 34
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 34
cgaactgct 9

<210> 35
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 35
cgaaggagt 9



<210> 36 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 36 

tccttcgct 9 

<210> 37
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 37
cggtgttgt 9

<210> 38
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 38
aacaccgct 9

<210> 39
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 39
cgttgctgt 9

<210> 40
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 40
agcaacgct 9

<210> 41
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 41
ccgatctgt 9



<210> 42 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 42 

agatcggct 9 

<210> 43
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 43
ccttctcgt 9

<210> 44
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 44
gagaaggct 9

<210> 45
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 45
tgagtccgt 9

<210> 46
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 46
ggactcact 9

<210> 47
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 47
tgctacggt

<210> 48 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 48 
cgttagact 

<210> 49 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 49 
gtgcgttga 

<210> 50 

<211> 9 

<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 50 
aacgcacac 

<210> 51 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 51 
gttggcaga 

<210> 52 

<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220>

<223> synthetic construct 

<400> 52 
tgccaacac 

<210> 53
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 53
cctgtagga 9



<210> 54 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 54 

ctacaggac 9 

<210> 55 

<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 55 

ctgcgtaga 9 

<210> 56
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 56
tacgcagac 9

<210> 57
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 57
cttacgcga 9

<210> 58
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 58
gcgtaagac 9

<210> 59
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 59
tggtcacga

<210> 60 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 60 
gtgaccaac 

<210> 61 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 61 
tcagagcga 

<210> 62 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 62 
gctctgaac 

<210> 63 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 63 
ttgctcgga 

<210> 64 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 64 
cgagcaaac 

<210> 65
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 65
gcagttgga

<210> 66 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 66 
caactgcac 

<210> 67 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 67 
gcctgaaga 

<210> 68 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 68 
ttcaggcac 

<210> 69 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 69 
gtagccaga 

<210> 70 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 70 
tggctacac 

<210> 71 
<211> 9
<212> DNA

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 71 
gtcgcttga 

<210> 72 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 72 
aagcgacac 

<210> 73 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 73 
gcctaagtt 

<210> 74 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 74 
cttaggctc 

<210> 75 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 



<400> 75 
gtagtgctt 

<210> 76 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 76 
gcactactc 

<210> 77
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 77
gtcgaagtt 9

<210> 78
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 78
cttcgactc 9

<210> 79
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 79
gtttcggtt 9

<210> 80
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 80
ccgaaactc 9

<210> 81
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 81
cagcgtttt 9

<210> 82
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 82
aacgctgtc 9

<210> 83
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 83
catacgctt 9

<210> 84
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 84
gcgtatgtc 9

<210> 85
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 85
cgatctgtt 9

<210> 86
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 86
cagatcgtc 9

<210> 87
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 87
cgctttgtt 9

<210> 88
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 88
caaagcgtc 9



<210> 89 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 89 

ccacagttt 9 

<210> 90
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 90
actgtggtc 9

<210> 91
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 91
cctgaagtt 9

<210> 92
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 92
cttcaggtc 9

<210> 93
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 93
ctgacgatt 9

<210> 94
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 94
tcgtcagtc

<210> 95 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 95 
ctccacttt 

<210> 96 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 96 
agtggagtc 



<210> 97 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 97 
accagagcc 

<210> 98 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 98 
ctctggtaa 

<210> 99 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 99 
atccgcacc 

<210> 100
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 100
tgcggataa

<210> 101 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 101 
gacgacacc 

<210> 102 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 102 
tgtcgtcaa 

<210> 103 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 103 
ggatggacc 

<210> 104 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 104 
tccatccaa 

<210> 105 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 105 
gcagaagcc 

<210> 106
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 106
cttctgcaa

<210> 107 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 107 
gccatgtcc 

<210> 108 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 108 
acatggcaa 

<210> 109 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 109 
gtctgctcc 

<210> 110 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 110 
agcagacaa 

<210> 111 
<211> 9 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> synthetic construct 

<400> 111 
cgacagacc 

<210> 112
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 112
tctgtcgaa

<210> 113 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 113 
cgctactcc 

<210> 114 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 114 
agtagcgaa 

<210> 115 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 115 
ccacagacc 

<210> 116 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 116 
tctgtggaa 

<210> 117 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 117 
cctctctcc 

<210> 118
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 118
agagaggaa

<210> 119 

<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 119 
ctcgtagcc 

<210> 120 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 120 
ctacgagaa 

<210> 121 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 121 

aaatcgatgt ggtcactcag 

<210> 122 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 122 

gagtgaccac atcgatttgg 

<210> 123 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 123 

aaatcgatgt ggactaggag 

<210> 124
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 124
cctagtccac atcgatttgg

<210> 125 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 125 

aaatcgatgt gccgtatgag 

<210> 126 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 126 

catacggcac atcgatttgg 

<210> 127 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 127 

aaatcgatgt gctgaaggag 

<210> 128 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 128 

ccttcagcac atcgatttgg 

<210> 129 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 129 

aaatcgatgt ggactagcag 

<210> 130
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 130
gctagtccac atcgatttgg

<210> 131 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 131 

aaatcgatgt gcgctaagag 

<210> 132 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 132 

cttagcgcac atcgatttgg 

<210> 133 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 133 

aaatcgatgt gagccgagag 

<210> 134 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 134 

ctcggctcac atcgatttgg 

<210> 135 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 135 

aaatcgatgt gccgtatcag 

<210> 136
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 136
gatacggcac atcgatttgg

<210> 137 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 137 

aaatcgatgt gctgaagcag 

<210> 138 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 138 

gcttcagcac atcgatttgg 

<210> 139 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 139 

aaatcgatgt gtgcgagtag 

<210> 140 
<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 140 

actcgcacac atcgatttgg 

<210> 141 
<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 141 

aaatcgatgt gtttggcgag 

<210> 142
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 142
cgccaaacac atcgatttgg 20

<210> 143 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 143 

aaatcgatgt gcgctaacag 20 

<210> 144 
<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 144 

gttagcgcac atcgatttgg 20 

<210> 145 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 



<400> 145

aaatcgatgt gagccgacag 

<210> 146 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 146 

gtcggctcac atcgatttgg 

<210> 147 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 147 

aaatcgatgt gagccgaaag 

<210> 148
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> synthetic construct

<400> 148
ttcggctcac atcgatttgg 20

<210> 149 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 149 

aaatcgatgt gtcggtagag 20 

<210> 150 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 150 

ctaccgacac atcgatttgg 20 

<210> 151 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 151 

aaatcgatgt ggttgccgag 20 

<210> 152 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 152 

cggcaaccac atcgatttgg 20 

<210> 153 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 153 

aaatcgatgt gagtgcgtag 20 

<210> 154
<211> 20
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 154 

acgcactcac atcgatttgg 

<210> 155 
<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 155 

aaatcgatgt ggttgccaag 

<210> 156 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 156 

tggcaaccac atcgatttgg 

<210> 157 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 157 

aaatcgatgt gtgcgaggag 

<210> 158 
<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 158 

cctcgcacac atcgatttgg 

<210> 159 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 



<400> 159

aaatcgatgt ggaacacgag 20

<210> 160
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 160 

cgtgttccac atcgatttgg 20 

<210> 161 
<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 161 

aaatcgatgt gcttgtcgag 20 

<210> 162 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 162 

cgacaagcac atcgatttgg 20 

<210> 163 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 163 

aaatcgatgt gttccggtag 20 

<210> 164 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 164 

accggaacac atcgatttgg 20 

<210> 165 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 165 

aaatcgatgt gtgcgagcag 20

<210> 166
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 166 

gctcgcacac atcgatttgg 20 

<210> 167 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 167 

aaatcgatgt ggtcaggtag 20 

<210> 168 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 168 

acctgaccac atcgatttgg 20 

<210> 169 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 



<400> 169 

aaatcgatgt ggcctgttag 20 

<210> 170 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 170 

aacaggccac atcgatttgg 20 

<210> 171 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 171 

aaatcgatgt ggaacaccag 

<210> 172 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 172 

ggtgttccac atcgatttgg 

<210> 173 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 173 

aaatcgatgt gcttgtccag 

<210> 174 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 174 

ggacaagcac atcgatttgg 

<210> 175 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 175 

aaatcgatgt gtgcgagaag 

<210> 176 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 176 

tctcgcacac atcgatttgg 

<210> 177 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 



PCT/US2006/041356 
<400> 177 

aaatcgatgt gagtgcggag 20 

<210> 178 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 178 

ccgcactcac atcgatttgg 20 

<210> 179 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 179 

aaatcgatgt gttgtccgag 20 

<210> 180 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 180 

cggacaacac atcgatttgg 20 

<210> 181 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 181 

aaatcgatgt gtggaacgag 20 

<210> 182 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 182 

cgttccacac atcgatttgg 20 

<210> 183 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 183 

aaatcgatgt gagtgcgaag 20 

<210> 184 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 184 

tcgcactcac atcgatttgg 20 



<210> 185 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 185 

aaatcgatgt gtggaaccag 20 

<210> 186 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 186 

ggttccacac atcgatttgg 20 

<210> 187 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 187 

aaatcgatgt gttaggcgag 20 

<210> 188 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 188 

cgcctaacac atcgatttgg 20 

<210> 189 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 189 

aaatcgatgt ggcctgtgag 20 

<210> 190 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 190 

cacaggccac atcgatttgg 20 

<210> 191 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 191 

aaatcgatgt gctcctgtag 20 

<210> 192 

<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 192 

acaggagcac atcgatttgg 20 

<210> 193 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 193 

aaatcgatgt ggtcaggcag 20 

<210> 194 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 194 

gcctgaccac atcgatttgg 20 

<210> 195 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 
<223> synthetic construct 
<400> 195 

aaatcgatgt ggtcaggaag 20 

<210> 196 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 196 

tcctgaccac atcgatttgg 20 

<210> 197 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 197 

aaatcgatgt ggtagccgag 20 

<210> 198 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 198 

cggctaccac atcgatttgg 20 

<210> 199 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 199 

aaatcgatgt ggcctgtaag 20 

<210> 200 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 200 

tacaggccac atcgatttgg 20 

<210> 201 
<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 
<223> synthetic construct 
<400> 201 

aaatcgatgt gctttcggag 20 

<210> 202 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 202 

ccgaaagcac atcgatttgg 20 

<210> 203 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 203 

aaatcgatgt gcgtaaggag 20 

<210> 204 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 204 

ccttacgcac atcgatttgg 20 

<210> 205 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 205 

aaatcgatgt gagagcgtag 20 

<210> 206 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 206 

acgctctcac atcgatttgg 20 

<210> 207 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 
<223> synthetic construct 
<400> 207 

aaatcgatgt ggacggcaag 

<210> 208 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 208 

tgccgtccac atcgatttgg 

<210> 209 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 209 

aaatcgatgt gctttcgcag 

<210> 210 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 210 

gcgaaagcac atcgatttgg 

<210> 211 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 211 

aaatcgatgt gcgtaagcag 

<210> 212 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 212 

gcttacgcac atcgatttgg 

<210> 213 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 
<223> synthetic construct 

<400> 213 

aaatcgatgt ggctatggag 20 

<210> 214 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 214 

ccatagccac atcgatttgg 20 

<210> 215 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 215 

aaatcgatgt gactctggag 20 

<210> 216 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 216 

ccagagtcac atcgatttgg 20 

<210> 217 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 217 

aaatcgatgt gctggaaag 19 

<210> 218 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 218 

ttccagcaca tcgatttgg 19 

<210> 219 
<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 
<223> synthetic construct 

<400> 219 

aaatcgatgt gccgaagtag 20 

<210> 220 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 220 

acttcggcac atcgatttgg 20 

<210> 221 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 221 

aaatcgatgt gctcctgaag 20 

<210> 222 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 222 

tcaggagcac atcgatttgg 20 



<210> 223 
<211> 20 
<212> DNA 



<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 223 

aaatcgatgt gtccagtcag 20 

<210> 224 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 224 

gactggacac atcgatttgg 20 

<210> 225 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 225 

aaatcgatgt gagagcggag 20 

<210> 226 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 226 

ccgctctcac atcgatttgg 20 

<210> 227 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 227 

aaatcgatgt gagagcgaag 20 

<210> 228 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 228 

tcgctctcac atcgatttgg 20 

<210> 229 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 229 

aaatcgatgt gccgaaggag 20 

<210> 230 
<211> 20 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> synthetic construct 
<400> 230 

ccttcggcac atcgatttgg 20 

<210> 231 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 231 

aaatcgatgt gccgaagcag 20 

<210> 232 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 232 

gcttcggcac atcgatttgg 20 

<210> 233 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 233 

aaatcgatgt gtgttccgag 20 

<210> 234 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 234 

cggaacacac atcgatttgg 20 

<210> 235 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 235 

aaatcgatgt gtctggcgag 20 

<210> 236 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 236 

cgccagacac atcgatttgg 20 

<210> 237 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 237 

aaatcgatgt gctatcggag 20 

<210> 238 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 238 

ccgatagcac atcgatttgg 20 

<210> 239 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 239 

aaatcgatgt gcgaaaggag 20 

<210> 240 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 240 

cctttcgcac atcgatttgg 20 

<210> 241 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 241 

aaatcgatgt gccgaagaag 20 

<210> 242 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 242 

tcttcggcac atcgatttgg 20 

<210> 243 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 243 

aaatcgatgt ggttgcagag 

<210> 244 
<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 244 

ctgcaaccac atcgatttgg 

<210> 245 
<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 245 

aaatcgatgt ggatggtgag 

<210> 246 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 246 

caccatccac atcgatttgg 

<210> 247 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 247 

aaatcgatgt gctatcgcag 

<210> 248 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 248 

gcgatagcac atcgatttgg 

<210> 249 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 249 

aaatcgatgt gcgaaagcag 

<210> 250 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 250 

gctttcgcac atcgatttgg 

<210> 251 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 



<400> 251 

aaatcgatgt gacactggag 

<210> 252 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 252 

ccagtgtcac atcgatttgg 

<210> 253 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 253 

aaatcgatgt gtctggcaag 

<210> 254 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 254 

tgccagacac atcgatttgg 

<210> 255 
<211> 20 
<212> DNA 
<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 255 

aaatcgatgt ggatggtcag 

<210> 256 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 256 

gaccatccac atcgatttgg 

<210> 257 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 257 

aaatcgatgt ggttgcacag 

<210> 258 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 258 

gtgcaaccac atcgatttgg 

<210> 259 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 259 

aaatcgatgt gggcatcgag 

<210> 260 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 260 

cgatgcccca tccgatttgg 

<210> 261 
<211> 20 
<212> DNA 
<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 261 

aaatcgatgt gtgcctccag 

<210> 262 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 262 

ggaggcacac atcgatttgg 

<210> 263 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 263 

aaatcgatgt gtgcctcaag 

<210> 264 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 264 

tgaggcacac atcgatttgg 

<210> 265 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 265 

aaatcgatgt gggcatccag 

<210> 266 
<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 266 

ggatgcccac atcgatttgg 

<210> 267 
<211> 20 
<212> DNA 
<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 267 

aaatcgatgt gggcatcaag 

<210> 268 

<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 268 

tgatgcccac atcgatttgg 

<210> 269 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 269 

aaatcgatgt gcctgtcgag 

<210> 270 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 270 

cgacaggcac atcgatttgg 

<210> 271 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 271 

aaatcgatgt ggacggatag 

<210> 272 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 272 

atccgtccac atcgatttgg 

<210> 273 
<211> 20 
<212> DNA 
<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 273 

aaatcgatgt gcctgtccag 20 

<210> 274 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 274 

ggacaggcac atcgatttgg 20 

<210> 275 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 275 

aaatcgatgt gaagcacgag 20 

<210> 276 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 276 

cgtgcttcac atcgatttgg 20 



<210> 277 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 277 

aaatcgatgt gcctgtcaag 20 

<210> 278 
<211> 20 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> synthetic construct 

<400> 278 

tgacaggcac atcgatttgg 20 

WO 2007/053358 

<210> 279 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 279 

aaatcgatgt gaagcaccag 20 

<210> 280 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 280 

ggtgcttcac atcgatttgg 20 

<210> 281 

<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 281 

aaatcgatgt gccttcgtag 20 

<210> 282 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 282 

acgaaggcac atcgatttgg 20 

<210> 283 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 283 

aaatcgatgt gtcgtccgag 20 

<210> 284 

<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 284 

cggacgacac atcgatttgg 20 
<210> 285 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 285 

aaatcgatgt ggagtctgag 

<210> 286 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 286 

cagactccac atcgatttgg 

<210> 287 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 287 

aaatcgatgt gtgatccgag 

<210> 288 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 288 

cggatcacac atcgatttgg 

<210> 289 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 289 

aaatcgatgt gtcaggcgag 

<210> 290 

<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 290 

cgcctgacac atcgatttgg 
<210> 291 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 291 

aaatcgatgt gtcgtccaag 

<210> 292 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 292 

tggacgacac atcgatttgg 

<210> 293 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 293 

aaatcgatgt ggacggagag 

<210> 294 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 294 

ctccgtccac atcgatttgg 

<210> 295 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 295 

aaatcgatgt ggtagcagag 

<210> 296 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 296 

ctgctaccac atcgatttgg 
<210> 297 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 297 

aaatcgatgt ggctgtgtag 

<210> 298 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 298 

acacagccac atcgatttgg 

<210> 299 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 299 

aaatcgatgt ggacggacag 

<210> 300 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 300 

gtccgtccac atcgatttgg 

<210> 301 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 301 

aaatcgatgt gtcaggcaag 

<210> 302 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 302 

tgcctgacac atcgatttgg 



<210> 303 
<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 303 

aaatcgatgt ggctcgaaag 20 

<210> 304 
<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 304 

ttcgagccac atcgatttgg 20 

<210> 305 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 305 

aaatcgatgt gccttcggag 20 

<210> 306 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 306 

ccgaaggcac atcgatttgg 20 

<210> 307 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 307 

aaatcgatgt ggtagcacag 20 

<210> 308 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 308 

gtgctaccac atcgatttgg 20 
<210> 309 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 309 

aaatcgatgt ggaaggtcag 20 

<210> 310 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 310 

gaccttccac atcgatttgg 20 

<210> 311 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 311 

aaatcgatgt ggtgctgtag 20 

<210> 312 
<211> 20 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> synthetic construct 
<400> 312 

acagcaccac atcgatttgg 20 

<210> 313 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 313 

gttgcctgt 9 

<210> 314 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 314 
aggcaacct 

<210> 315 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 315 
caggacggt 

<210> 316 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 316 
cgtcctgct 

<210> 317 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 317 
agacgtggt 

<210> 318 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 318 
cacgtctct 

<210> 319 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 319 
caggaccgt 

<210> 320 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 320 
ggtcctgct 

<210> 321 

<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 321 
caggacagt 

<210> 322 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 322 
tgtcctgct 

<210> 323 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 323 
cactctggt 

<210> 324 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 324 
cagagtgct 

<210> 325 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 325 
gacggctgt 

<210> 326 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 326 
agccgtcct 

<210> 327 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 327 
cactctcgt 

<210> 328 

<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 328 
gagagtgct 

<210> 329 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 329 
gtagcctgt 

<210> 330 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 330 
aggctacct 

<210> 331 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 331 
gccacttgt 

<210> 332 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 332 
aagtggcct 

<210> 333 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 333 
catcgctgt 

<210> 334 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 334 
agcgatgct 

<210> 335 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 335 
cactggtgt 

<210> 336 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 336 
accagtgct 

<210> 337 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 337 
gccactggt 

<210> 338 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 338 
cagtggcct 

<210> 339 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 339 
tctggctgt 

<210> 340 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 340 
agccagact 

<210> 341 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 341 
gccactcgt 

<210> 342 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 342 
gagtggcct 

<210> 343 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 343 
tgcctctgt 

<210> 344 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 




<400> 344 
agaggcact 

<210> 345 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 345 
catcgcagt 

<210> 346 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 346 
tgcgatgct 

<210> 347 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 347 
caggaaggt 

<210> 348 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 348 
cttcctgct 

<210> 349 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 349 
ggcatctgt 

<210> 350 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 350 
agatgccct 

<210> 351 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 351 
cggtgctgt 

<210> 352 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 352 
agcaccgct 

<210> 353 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 353 
cactggcgt 

<210> 354 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 354 
gccagtgct 

<210> 355 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 355 
tctcctcgt 

<210> 356 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 356 
gaggagact 

<210> 357 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 357 
cctgtctgt 

<210> 358 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 358 
agacaggct 

<210> 359 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 359 
caacgctgt 

<210> 360 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 360 
agcgttgct 

<210> 361 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 361 
tgcctcggt 

<210> 362 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 




<400> 362 
cgaggcact 

<210> 363 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 363 
acactgcgt 

<210> 364 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 364 
gcagtgtct 

<210> 365 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 365 
tcgtcctgt 

<210> 366 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 366 
aggacgact 

<210> 367 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 367 
gctgccagt 

<210> 368 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 368 
tggcagcct 

<210> 369 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 369 
tcaggctgt 

<210> 370 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 370 
agcctgact 

<210> 371 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 371 
gccaggtgt 

<210> 372 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 372 
acctggcct 

<210> 373 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 373 
cggacctgt 

<210> 374 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 374 
aggtccgct 

<210> 375 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 375 
caacgcagt 

<210> 376 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 376 
tgcgttgct 

<210> 377 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 377 
cacacgagt 

<210> 378 

<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 378 
tcgtgtgct 

<210> 379 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 379 
atggcctgt 

<210> 380 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 380 
aggccatct 

<210> 381 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 381 
ccagtctgt 

<210> 382 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 382 
agactggct 

<210> 383 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 383 
gccaggagt 

<210> 384 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 384 
tcctggcct 

<210> 385 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 385 
cggaccagt 

<210> 386 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 386 
tggtccgct 

<210> 387 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 387 
ccttcgcgt 

<210> 388 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 388 
gcgaaggct 

<210> 389 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 389 
gcagccagt 

<210> 390 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 390 
tggctgcct 

<210> 391 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 391 
ccagtcggt 

<210> 392 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 392 
cgactggct 

<210> 393 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 393 
actgagcgt 

<210> 394 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 394 
gctcagtct 

<210> 395 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 395 
ccagtccgt 

<210> 396 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 396 
ggactggct 

<210> 397 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 397 
ccagtcagt 

<210> 398 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 398 
tgactggct 

<210> 399 

<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 399 
catcgaggt 

<210> 400 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 400 
ctcgatgct 



<210> 401 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 401 
ccatcgtgt 

<210> 402 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 402 
acgatggct 

<210> 403 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 403 
gtgctgcgt 

<210> 404 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 
<223> synthetic construct

<400> 404 
gcagcacct 

<210> 405 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 405 
gactacggt 

<210> 406 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 406 
cgtagtcct 

<210> 407 

<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 407 
gtgctgagt 

<210> 408 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 408 
tcagcacct 

<210> 409 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 409 
gctgcatgt 

<210> 410 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 
<223> synthetic construct

<400> 410 
atgcagcct 



<210> 411 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 411 
gagtggtgt 

<210> 412 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct

<400> 412
accactcct 

<210> 413 
<211> 9 
<212> DNA 

<213> Artificial Sequence
<220> 

<223> synthetic construct 

<400> 413
gactaccgt 

<210> 414 
<211> 9 
<212> DNA 

<213> Artificial Sequence
<220> 

<223> synthetic construct

<400> 414
ggtagtcct 

<210> 415 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 415 
cggtgatgt 

<210> 416 
<211> 9 
<212> DNA

<213> Artificial Sequence 
<220>

<223> synthetic construct 

<400> 416 
atcaccgct 

<210> 417 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 417 
tgcgactgt 

<210> 418 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 418 
agtcgcact 

<210> 419 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 419 
tctggaggt 

<210> 420 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 420 
ctccagact 

<210> 421 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 421 
agcactggt 

<210> 422 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct 

<400> 422 
cagtgctct 

<210> 423 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 423 
tcgcttggt 

<210> 424 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 424 
caagcgact 

<210> 425 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 425 
agcactcgt 

<210> 426 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 426 
gagtgctct 

<210> 427 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 427 
gcgattggt 

<210> 428 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct 

<400> 428 
caatcgcct 

<210> 429 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 429 
ccatcgcgt 

<210> 430 
<211> 9 
<212> DNA

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 430 
gcgatggct 

<210> 431 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 431 
tcgcttcgt 

<210> 432 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 432 
gaagcgact 

<210> 433 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 433 
agtgcctgt 

<210> 434 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct 

<400> 434 
aggcactct 

<210> 435 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 435 
ggcataggt 

<210> 436 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 436 
ctatgccct 

<210> 437 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 437 
gcgattcgt 

<210> 438 
<211> 9 
<212> DNA 



<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 438 

gaatcgcct

<210> 439 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 



<400> 439 



tgcgacggt

<210> 440 
<211> 9 
<212> DNA 
<213> Artificial Sequence
<220> 

<223> synthetic construct 

<400> 440 
cgtcgcact 



<210> 441 

<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 441 
gagtggcgt 

<210> 442 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 442 
gccactcct 

<210> 443 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 443 
cggtgaggt 

<210> 444 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 444 
ctcaccgct 

<210> 445 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 445 
gctgcaagt 

<210> 446 
<211> 9 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 446 
ttgcagcct 

<210> 447 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 447 
ttccgctgt 

<210> 448 

<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 448 
agcggaact 

<210> 449 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 449 

gagtggagt 

<210> 450 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 450 
tccactcct 



<210> 451 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 451 

acagagcgt
<210> 452
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 452 
gctctgtct 

<210> 453 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 453 
tgcgaccgt 

<210> 454 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 454 
ggtcgcact 

<210> 455 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 455 
cctgtaggt 

<210> 456 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 456 
ctacaggct 

<210> 457 
<211> 9
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 457 
tagccgtgt 
<210> 458
<211> 9 
<212> DNA 



<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 458 
acggctact 

<210> 459 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 459 

tgcgacagt 

<210> 460 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 460 
tgtcgcact 

<210> 461 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 461 
ggtctgtgt 

<210> 462 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 



<400> 462 

acagaccct

<210> 463 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 463

cggtgaagt

<210> 464 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 464 

ttcaccgct

<210> 465 
<211> 9 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> synthetic construct 

<400> 465 
caacgaggt 

<210> 466 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 466 
ctcgttgct 

<210> 467 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 467 
gcagcatgt 

<210> 468 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 468 
atgctgcct 

<210> 469 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 469
tcgtcaggt 

<210> 470 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 470 
ctgacgact 

<210> 471 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 471 
agtgccagt 

<210> 472 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 472 
tggcactct 

<210> 473 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 473 
tagaggcgt 

<210> 474 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 474 
gcctctact 

<210> 475 
<211> 9 
<212> DNA 

<213> Artificial Sequence 



<220> 
<223> synthetic construct
<400> 475 

gtcagcggt

<210> 476 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 476 

cgctgacct

<210> 477 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 477 

tcaggaggt

<210> 478 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 478 

ctcctgact

<210> 479 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 479 



agcaggtgt

<210> 480 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 480 

acctgctct

<210> 481 
<211> 9 
<212> DNA 
<213> Artificial Sequence
<220> 

<223> synthetic construct 

<400> 481 

ttccgcagt

<210> 482 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 482 

tgcggaact

<210> 483 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 483 

gtcagccgt

<210> 484 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 484 

ggctgacct

<210> 485 
<211> 9 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> synthetic construct 

<400> 485 
ggtctgcgt 

<210> 486 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 486 
gcagaccct 

<210> 487 
<211> 9
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 487 
tagccgagt 

<210> 488 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 488 
tcggctact 

<210> 489 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 489 

gtcagcagt 

<210> 490 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 490 
tgctgacct 

<210> 491 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 491 
ggtctgagt 

<210> 492 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 492 
tcagaccct 
<210> 493
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 493 
cggacaggt 

<210> 494 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 494 
ctgtccgct 

<210> 495 
<211> 9 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> synthetic construct 

<400> 495 
ttagccggt 

<210> 496 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 496 
cggctaact 

<210> 497 

<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 497 
gagacgagt 

<210> 498 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 498
tcgtctcct 

<210> 499 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 499 

cgtaaccgt 

<210> 500 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 500 
ggttacgct 

<210> 501 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 501 
ttggcgtgt 

<210> 502 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 502 
acgccaact 

<210> 503 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 503 
atggcaggt 

<210> 504 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 
<223> synthetic construct
<400> 504 

ctgccatct

<210> 505 
<211> 9 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> synthetic construct 

<400> 505 
cagctacga 

<210> 506 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 506 
gtagctgac 

<210> 507 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 507 
ctcctgcga 

<210> 508 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 508 
gcaggagac 

<210> 509 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 509 
gctgcctga 

<210> 510 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct

<400> 510 
aggcagcac 


<210> 511 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 511 
caggaacga 


<210> 512 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 512 
gttcctgac 


<210> 513 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct

<400> 513 
cacacgcga 


<210> 514 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 514 
gcgtgtgac 


<210> 515 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 515 
gcagcctga 


<210> 516 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct 

<400> 516 
aggctgcac 

<210> 517 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 517 
ctgaacgga 

<210> 518 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 518 
cgttcagac 

<210> 519 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 519 
ctgaaccga 

<210> 520 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 520 
ggttcagac 

<210> 521 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 521 
tctggacga 

<210> 522 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct 

<400> 522 
gtccagaac 

<210> 523 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 523 
tgcctacga 

<210> 524 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 524 
gtaggcaac 

<210> 525 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 525 
ggcatacga 

<210> 526 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 526 
gtatgccac 

<210> 527 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 527 
cggtgacga 

<210> 528 
<211> 9 
<212> DNA 

<213> Artificial Sequence 






<220> 

<223> synthetic construct 

<400> 528 
gtcaccgac 

<210> 529 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 529 
caacgacga 

<210> 530 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 530 
gtcgttgac 

<210> 531 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 531 
ctcctctga 

<210> 532 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 532 
agaggagac 

<210> 533 

<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 533 
tcaggacga 

<210> 534 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct 

<400> 534 
gtcctgaac 


<210> 535 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 535 
aaaggcgga 


<210> 536 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 536 
cgcctttac 


<210> 537
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct

<400> 537 
ctcctcgga 


<210> 538 
<211> 9 
<212> DNA 

<213> Artificial Sequence
<220> 

<223> synthetic construct 

<400> 538 
cgaggagac 


<210> 539 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 539 
cagatgcga 


<210> 540 
<211> 9 
<212> DNA 
<213> Artificial Sequence

<220> 

<223> synthetic construct 

<400> 540 

gcatctgac

<210> 541 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 541 

gcagcaaga

<210> 542 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 



<400> 542 
ttgctgcac 

<210> 543 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 543 
gtggagtga 

<210> 544 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 544 
actccacac 

<210> 545 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 545 
ccagtagga 

<210> 546 
<211> 9 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 546 
ctactggac 

<210> 547 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 547 
atggcacga 

<210> 548 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 548 
gtgccatac 

<210> 549 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 549 
ggactgtga 

<210> 550 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 550 
acagtccac 

<210> 551 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 551 
ccgaactga 

<210> 552 
<211> 9
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 



<400> 552 
agttcggac 

<210> 553 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 553 
ctcctcaga 

<210> 554 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 554 
tgaggagac 

<210> 555 
<211> 9 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> synthetic construct 

<400> 555 
cactgctga 

<210> 556 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 556 
agcagtgac 

<210> 557 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 557 
agcaggcga 
<210> 558
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 558 
gcctgctac 

<210> 559 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 559 
agcaggaga 

<210> 560 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 560 
tcctgctac 

<210> 561 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 561 
agagccaga 

<210> 562 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 562 
tggctctac 

<210> 563 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 563 
gtcgttgga 
<210> 564
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 564
caacgacac 

<210> 565 

<211> 9 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> synthetic construct 

<400> 565
ccgaacgga 

<210> 566 
<211> 9 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> synthetic construct 

<400> 566
cgttcggac 

<210> 567 
<211> 9 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> synthetic construct 

<400> 567
cactgcgga 

<210> 568 
<211> 9 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> synthetic construct 

<400> 568
cgcagtgac 

<210> 569 
<211> 9 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> synthetic construct 



<400> 569 
gtggagcga 






<210> 570 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 570 
gctccacac 

<210> 571 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 571 
gtggagaga 

<210> 572 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 572 
tctccacac 

<210> 573 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 573 
ggactgcga 

<210> 574 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 574 
gcagtccac 

<210> 575 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 575 
ccgaaccga 
<210> 576
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 576 
ggttcggac 

<210> 577 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 577 
cactgccga 

<210> 578 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 578 
ggcagtgac 

<210> 579 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 579 
cgaaacgga 

<210> 580 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 580 
cgtttcgac 

<210> 581 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 581 
ggactgaga 






<210> 582 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 582 
tcagtccac 

<210> 583 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct 

<400> 583 
ccgaacaga 

<210> 584 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 584 
tgttcggac 

<210> 585 
<211> 9 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 585 
cgaaaccga 

<210> 586 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 586 
ggtttcgac 

<210> 587 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 587 
ctggcttga 
<210> 588
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 588 
aagccagac 

<210> 589 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 589 
cacacctga 

<210> 590 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 590 
aggtgtgac 

<210> 591 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 591 
aacgaccga 

<210> 592 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 592 
ggtcgttac 

<210> 593 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 593 
atccagcga 
<210> 594
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 594 

gctggatac

<210> 595 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 595 

tgcgaagga

<210> 596 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 596 

cttcgcaac

<210> 597 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 597 

tgcgaacga

<210> 598 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 598 

gttcgcaac

<210> 599 

<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 599 
ctggctgga

<210> 600 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 600 
cagccagac 

<210> 601 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 601 
cacaccgga 

<210> 602 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 



<400> 602 
cggtgtgac 

<210> 603
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 603 
agtgcagga 

<210> 604 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 604 
ctgcactac 

<210> 605 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 605
gaccgttga

<210> 606 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 606 

aacggtcac

<210> 607 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 607 

ggtgagtga

<210> 608 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 608 

actcaccac

<210> 609
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 609 

ccttcctga

<210> 610 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 610 

aggaaggac

<210> 611 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 611
ctggctaga 

<210> 612 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 



<400> 612 
tagccagac 

<210> 613 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 613 
cacaccaga 

<210> 614 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 614 
tggtgtgac 

<210> 615 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 615 
agcggtaga 

<210> 616 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 616 
taccgctac 

<210> 617 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 
<223> synthetic construct

<400> 617 
gtcagagga 

<210> 618 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 618 
ctctgacac 

<210> 619 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 619 
ttccgacga 

<210> 620 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 620 
gtcggaaac 

<210> 621 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 621 
aggcgtaga 

<210> 622 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 622 
tacgcctac 

<210> 623 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 
<223> synthetic construct

<400> 623 
ctcgactga 

<210> 624 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 624 
agtcgagac 

<210> 625 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 625 
tacgctgga 

<210> 626 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 626 
cagcgtaac 

<210> 627 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 627 
gttcggtga 

<210> 628 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 628 
accgaacac 

<210> 629 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 
<223> synthetic construct

<400> 629 
gccagcaga 

<210> 630 

<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 630 
tgctggcac 

<210> 631 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 631 
gaccgtaga 

<210> 632 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 632 
tacggtcac 

<210> 633 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 633 
gtgctctga 

<210> 634 

<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 634 
agagcacac 

<210> 635 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct 

<400> 635 
ggtgagcga 

<210> 636
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 636 
gctcaccac 

<210> 637 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 637 
ggtgagaga 

<210> 638 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 638 
tctcaccac 

<210> 639 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 639 
ccttccaga 

<210> 640 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 640 
tggaaggac 

<210> 641 
<211> 9 

<212> DNA 

<213> Artificial Sequence 




<220> 

<223> synthetic construct 

<400> 641 
ctcctacga 

<210> 642 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 642 
gtaggagac 

<210> 643 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 643 
ctcgacgga 

<210> 644 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 644 
cgtcgagac 

<210> 645 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 645 
gccgtttga 

<210> 646 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 646 
aaacggcac 

<210> 647 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct 

<400> 647 
gcggagtga 

<210> 648 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 648 
actccgcac 

<210> 649 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 649 
cgtgcttga 

<210> 650 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 650 
aagcacgac 

<210> 651 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 651 
ctcgaccga 

<210> 652 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 652 
ggtcgagac 

<210> 653 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct 

<400> 653 
agagcagga 

<210> 654 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 654 
ctgctctac 

<210> 655 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 655 
gtgctcgga 

<210> 656 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 656
cgagcacac 

<210> 657 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 657 
ctcgacaga

<210> 658 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 658 
tgtcgagac 

<210> 659 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct 

<400> 659 
ggagagtga 

<210> 660 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 660 
actctccac 

<210> 661 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 661 
aggctgtga 

<210> 662 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 662 
acagcctac 

<210> 663 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 663 
agagcacga 

<210> 664 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 664 
gtgctctac 

<210> 665 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct 

<400> 665
ccatcctga 

<210> 666 

<211> 9 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> synthetic construct 

<400> 666
aggatggac 

<210> 667 
<211> 9 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> synthetic construct 

<400> 667 
gttcggaga 

<210> 668 
<211> 9 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> synthetic construct 

<400> 668 
tccgaacac 

<210> 669 
<211> 9 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> synthetic construct 

<400> 669
tggtagcga 

<210> 670 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 670
gctaccaac 

<210> 671 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct 

<400> 671 
gtgctccga 

<210> 672 

<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 672 
ggagcacac 

<210> 673 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 673 
gtgctcaga 

<210> 674 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 674 
tgagcacac 

<210> 675 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 675 
gccgttgga 

<210> 676 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 676 
caacggcac 

<210> 677 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct 

<400> 677 
gagtgctga 

<210> 678 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 678 
agcactcac 

<210> 679 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 679 
gctccttga 

<210> 680 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 680 
aaggagcac 

<210> 681 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 681 
ccgaaagga 

<210> 682 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 682 
ctttcggac 

<210> 683 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct 
<400> 683 

cactgagga 9 

<210> 684 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 684 

ctcagtgac 9 

<210> 685 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 685 

cgtgctgga 9 

<210> 686 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 686 

cagcacgac 9 

<210> 687 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 687 

ccgaaacga 9 

<210> 688 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 688 

gtttcggac 9 

<210> 689 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct 

<400> 689 
gcggagaga 

<210> 690 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 690 
tctccgcac 

<210> 691 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 691 
gccgttaga 

<210> 692 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 692 
taacggcac 

<210> 693 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 693 
tctcgtgga 

<210> 694 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 694
cacgagaac 

<210> 695 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct 

<400> 695 

cgtgctaga 9 

<210> 696 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 696 

tagcacgac 9 



<210> 697 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 697 
gcctgtctt 

<210> 698 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 698 
gacaggctc 

<210> 699 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 699 
ctcctggtt 

<210> 700 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 700 
ccaggagtc 

<210> 701 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct 

<400> 701 
actctgctt 

<210> 702 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 702 
gcagagttc 

<210> 703 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 703 
catcgcctt 

<210> 704 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 704 
ggcgatgtc 

<210> 705 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 705 
gccactatt 

<210> 706 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 706 
tagtggctc 



<210> 707 
<211> 9 
<212> DNA 
<213> Artificial Sequence
<220> 

<223> synthetic construct 

<400> 707 
cacacggtt 

<210> 708 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 708 
ccgtgtgtc 

<210> 709 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 709 
caacgcctt 

<210> 710 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 710 
ggcgttgtc 

<210> 711 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 711 
actgaggtt 

<210> 712 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 712 
cctcagttc 

<210> 713 
<211> 9 
<212> DNA 
<213> Artificial Sequence

<220> 

<223> synthetic construct 

<400> 713 
gtgctggtt 

<210> 714 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 714 
ccagcactc 

<210> 715 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 715 
catcgactt 

<210> 716 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 716 
gtcgatgtc 

<210> 717 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 717 
ccatcggtt 

<210> 718 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 718 
ccgatggtc 

<210> 719 
<211> 9 
<212> DNA 
<213> Artificial Sequence
<220> 

<223> synthetic construct 

<400> 719 
gctgcactt 

<210> 720 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 720 
gtgcagctc 

<210> 721 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 721 
acagaggtt 

<210> 722 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct

<400> 722 
cctctgttc 

<210> 723 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 723 
agtgccgtt 

<210> 724 

<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 724 
cggcacttc 

<210> 725 
<211> 9 
<212> DNA

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 725 
cggacattt 

<210> 726 

<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 726 
atgtccgtc 

<210> 727 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 727 
ggtctggtt 

<210> 728 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 728 
ccagacctc 

<210> 729 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 729 
gagacggtt 

<210> 730 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 730 
ccgtctctc 

<210> 731 
<211> 9 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 731 
ctttccgtt 

<210> 732 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 732 
cggaaagtc 

<210> 733 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 733 
cagatggtt 

<210> 734 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 734 
ccatctgtc 

<210> 735 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 735 

cggacactt 

<210> 736 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 736 
gtgtccgtc 
<210> 737
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 737 
actctcgtt 

<210> 738 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 738 
cgagagttc 

<210> 739 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 739 
gcagcactt

<210> 740 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 740 
gtgctgctc 

<210> 741 
<211> 9 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> synthetic construct 
<400> 741 

actctcctt 9

<210> 742 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 742 

ggagagttc 9 
<210> 743
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 743 

accttggtt 9 

<210> 744 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 744 

ccaaggttc 9 

<210> 745 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 745 

agagccgtt 9 

<210> 746 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 746 

cggctcttc 9 

<210> 747 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 747 

accttgctt 9 

<210> 748 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 748 

gcaaggttc 9 
<210> 749
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 749 

aagtccgtt 9 

<210> 750 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 750 

cggactttc 9 

<210> 751 
<211> 9 
<212> DNA 

<213> Artificial Sequence 



<220>

<223> synthetic construct 

<400> 751 

ggactggtt 9 

<210> 752 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 752 

ccagtcctc 9 

<210> 753 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 753

gtcgttctt 9 

<210> 754 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 754 
gaacgactc

<210> 755 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 755 
cagcatctt 

<210> 756 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 756 
gatgctgtc 

<210> 757 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 757 
ctatccgtt 

<210> 758 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 758 
cggatagtc 

<210> 759 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 759 
acactcgtt 

<210> 760 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 760 
cgagtgttc

<210> 761 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 761 
atccaggtt 

<210> 762 

<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 762 
cctggattc 

<210> 763 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 763 
gttcctgtt 

<210> 764 

<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 764 
caggaactc 

<210> 765 
<211> 9 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 765 
acactcctt 

<210> 766 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 766
ggagtgttc 

<210> 767 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 767 
gttcctctt 

<210> 768 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 768 
gaggaactc 

<210> 769 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 769 
ctggctctt 

<210> 770 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 770 
gagccagtc 

<210> 771 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 771 
acggcattt 

<210> 772 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 772
atgccgttc 

<210> 773 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 773 
ggtgaggtt 

<210> 774 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 774 
cctcacctc 

<210> 775 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 775 
ccttccgtt 

<210> 776 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 776 
cggaaggtc 

<210> 777 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 777 
tacgctctt 

<210> 778 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 778
gagcgtatc 

<210> 779 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 779 
acggcagtt 

<210> 780 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 780 
ctgccgttc 

<210> 781 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 781 
actgacgtt 

<210> 782 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 782 
cgtcagttc 

<210> 783 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 783 
acggcactt 

<210> 784 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 784
gtgccgttc 

<210> 785 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 785 
actgacctt 

<210> 786 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 786 
ggtcagttc 

<210> 787 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 787 
tttgcggtt 

<210> 788 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 788 
ccgcaaatc 

<210> 789 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 789 
tggtaggtt 

<210> 790 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 790
cctaccatc 

<210> 791 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 791 
gttcggctt 

<210> 792 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 792 
gccgaactc 

<210> 793 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 793 
gccgttctt 

<210> 794 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 794 
gaacggctc 

<210> 795 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 795 
ggagaggtt 

<210> 796 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 



<400> 796
cctctcctc 


<210> 797 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 797 
cactgactt 


<210> 798 

<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 798 

gtcagtgtc 9

<210> 799 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 799 
cgtgctctt 


<210> 800 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 800 
gagcacgtc 


<210> 801 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 801 
aatccgctt 


<210> 802 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 
<223> synthetic construct

<400> 802 
gcggatttc 

<210> 803 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 803 
aggctggtt 

<210> 804 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 804 
ccagccttc 

<210> 805 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 805 
gctagtgtt 

<210> 806 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 806 
cactagctc 

<210> 807 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 807 
ggagagctt 

<210> 808 
<211> 9 

<212> DNA 

<213> Artificial Sequence 



<220>

<223> synthetic construct 



<400> 808 
gctctcctc 

<210> 809 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 809 
ggagagatt 

<210> 810 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 810 
tctctcctc 

<210> 811 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 811 
aggctgctt 

<210> 812 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 812 
gcagccttc 

<210> 813 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 813 
gagtgcgtt 

<210> 814 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct 

<400> 814 
cgcactctc 

<210> 815 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 815 
ccatccatt 

<210> 816 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 816 
tggatggtc 

<210> 817 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 817 
gctagtctt 

<210> 818 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 818 
gactagctc 

<210> 819 

<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 819 
aggctgatt 

<210> 820 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct 

<400> 820 
tcagccttc 

<210> 821 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 821 
acagacgtt 

<210> 822 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 822 
cgtctgttc 

<210> 823 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 823 
gagtgcctt 

<210> 824 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 824 
ggcactctc 

<210> 825 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 825 
acagacctt 

<210> 826 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct 
<400> 826 

ggtctgttc 9 

<210> 827 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 827 

cgagctttt 9 

<210> 828 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 828 

aagctcgtc 9 

<210> 829 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 829 

ttagcggtt 9 

<210> 830 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 830 

ccgctaatc 9 

<210> 831 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 831 

cctcttgtt 9 

<210> 832 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct 
<400> 832 

caagaggtc 9 

<210> 833 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 833 

ggtctcttt 9 

<210> 834 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 834 

agagacctc 9 

<210> 835 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 835 

gccagattt 9 

<210> 836 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 836 

atctggctc 9 

<210> 837 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 837 

gagaccttt 9 

<210> 838 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct 

<400> 838 
aggtctctc 

<210> 839
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 839 
cacacagtt 

<210> 840 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct

<400> 840 
ctgtgtgtc 

<210> 841 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 841 
cctcttctt 

<210> 842 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 842 
gaagaggtc 

<210> 843 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 843
tagagcgtt 

<210> 844 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct 

<400> 844 
cgctctatc 

<210> 845 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 845 
gcacctttt 

<210> 846 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 846 
aaggtgctc 

<210> 847 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 847 
ggcttgttt 

<210> 848 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 848 
acaagcctc 

<210> 849 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 849 
gacgcgatt 

<210> 850 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct 

<400> 850 
tcgcgtctc 


<210> 851 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 851 
cgagctgtt 


<210> 852 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct

<400> 852 
cagctcgtc 


<210> 853 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 853 
tagagcctt 


<210> 854 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 854 
ggctctatc 


<210> 855 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 855 
catccgttt 


<210> 856 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct 

<400> 856 
acggatgtc 


<210> 857 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 857 
ggtctcgtt 


<210> 858 
<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 858 
cgagacctc 


<210> 859 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 859 
gccagagtt 


<210> 860 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 860 
ctctggctc 


<210> 861 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 861 
gagaccgtt 


<210> 862 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> synthetic construct 

<400> 862 
cggtctctc 

<210> 863 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 863 
cgagctatt 

<210> 864 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 864 
tagctcgtc 

<210> 865 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 865 
gcaagtgtt 

<210> 866 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 866 
cacttgctc 



<210> 867 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 867 

ggtctcctt 9 

<210> 868 
<211> 9 
<212> DNA 
<213> Artificial Sequence
<220> 

<223> synthetic construct 

<400> 868 
ggagacctc 

<210> 869 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 869 
gccagactt 

<210> 870 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 870 
gtctggctc 

<210> 871 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 871 
ggtctcatt 

<210> 872 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 872 
tgagacctc 

<210> 873 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 873 
gagaccatt 

<210> 874 
<211> 9 
<212> DNA 
<213> Artificial Sequence
<220> 

<223> synthetic construct 
<400> 874 

tggtctctc 9 

<210> 875 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 875 

ccttcagtt 9 

<210> 876 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 876 

ctgaaggtc 9 


<210> 877 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 877 

gcacctgtt 9 

<210> 878 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 



<400> 878 

caggtgctc 9 

<210> 879 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 879 

aaaggcgtt 9 

<210> 880 
<211> 9 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 880 
cgccttttc 

<210> 881 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 881 
cagatcgtt 

<210> 882 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 882 
cgatctgtc 

<210> 883 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 883 
cataggctt 

<210> 884 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 884 
gcctatgtc 

<210> 885 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 885 
ccttcactt 

<210> 886 
<211> 9 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 886 
gtgaaggtc 

<210> 887 
<211> 9 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 

<400> 887 
gcacctctt 

<210> 888 
<211> 9 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 

<400> 888 
gaggtgctc 

<210> 889 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> synthetic construct 
<400> 889 

cagaagacag acaagcttca cctgc 

<210> 890 
<211> 27 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> synthetic construct 
<400> 890 

gcaggtgaag cttgtctgtc ttctgaa 